Journal of Xidian University

Dysarthria recognition combining speech fusion feature and random forest

LI Dong;ZHANG Xueying;DUAN Shufei;YAN Mimi   

  1. (College of Information Engineering, Taiyuan Univ. of Technology, Taiyuan 030024, China)
  • Received: 2017-07-26  Online: 2018-06-20  Published: 2018-07-18

Abstract:

This paper proposes a recognition method that combines a speech fusion feature with a random forest to classify normal voices and voices with dysarthria. The work aims to analyze the differences in pronunciation between patients and healthy speakers, and to provide doctors with scientific and objective evidence for diagnosis and treatment. First, the method takes the pathological voice database developed by the University of Toronto as the corpus. It then extracts five types of prosodic features and Mel frequency cepstrum coefficients (MFCC) and calculates their statistical features, which together compose the fusion feature. Finally, a random forest is used as the classifier. The results show that, compared with any single type of feature, the proposed fusion feature significantly improves recognition performance; combined with the random forest, the classification accuracy reaches 99.21% for male speakers, 98.97% for female speakers, and 98.00% overall. The study also finds that patients pronounce short words more accurately than sentences.
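
The pipeline described in the abstract (frame-level features, per-utterance statistics, fusion by concatenation, random forest) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the statistics used (mean, standard deviation, minimum, maximum), the 13 MFCC dimensions, the forest size, and the helper names `fuse_features` and `make_synthetic_utterance` are all hypothetical, and synthetic random frames stand in for features extracted from real recordings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def fuse_features(prosodic_frames, mfcc_frames):
    """Concatenate per-utterance statistics of two frame-level feature matrices.

    Each input has shape (n_frames, n_dims). The choice of statistics
    (mean, std, min, max) is an assumption; the abstract does not list them.
    """
    def stats(frames):
        return np.concatenate([
            frames.mean(axis=0),
            frames.std(axis=0),
            frames.min(axis=0),
            frames.max(axis=0),
        ])
    return np.concatenate([stats(prosodic_frames), stats(mfcc_frames)])


def make_synthetic_utterance(rng, shift):
    """Stand-in for real feature extraction: random frames with a class-dependent shift."""
    n_frames = int(rng.integers(40, 80))
    prosodic = rng.normal(loc=shift, scale=1.0, size=(n_frames, 5))   # 5 prosodic tracks
    mfcc = rng.normal(loc=-shift, scale=1.0, size=(n_frames, 13))     # 13 MFCCs (assumed)
    return prosodic, mfcc


rng = np.random.default_rng(0)
labels = np.array([0] * 60 + [1] * 60)  # 0 = normal, 1 = dysarthric (synthetic labels)
X = np.stack([fuse_features(*make_synthetic_utterance(rng, shift=0.5 * y)) for y in labels])

# Hold out every fourth utterance for a quick sanity check.
test_mask = np.arange(len(labels)) % 4 == 0
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[~test_mask], labels[~test_mask])
accuracy = clf.score(X[test_mask], labels[test_mask])
print(f"fused vector length: {X.shape[1]}, held-out accuracy: {accuracy:.2f}")
```

Because every utterance is reduced to a fixed-length vector of statistics, recordings of different durations (short words or whole sentences) can be fed to the same classifier. The accuracy printed here reflects only the synthetic data and says nothing about the results reported in the paper.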

Key words: prosodic feature, Mel frequency cepstrum coefficient, fusion feature, random forest, dysarthria recognition