Deep Learning Based Prediction Of Hypernasality For Clinical Applications
Vikram C Mathad, Kathy Chapman, Julie Liss, Nancy Scherer, Visar Berisha
Hypernasality refers to the perception of excessive nasal resonance during the production of oral sounds. Existing methods for automatic assessment of hypernasality from speech are based on machine learning models trained on disordered speech databases rated by speech-language pathologists. However, the performance of such systems critically depends on the availability of hypernasal speech samples and the reliability of clinical ratings. In this paper, we propose a new approach that uses speech samples from healthy controls to model the acoustic characteristics of nasalized speech. Using healthy speech samples, we develop a 4-class deep neural network classifier that labels speech frames as nasal consonants, oral consonants, nasalized vowels, or oral vowels. We use the classifier to compute nasalization scores for clinical speech samples and show that the resulting scores correlate with clinical perception of hypernasality. The proposed approach is evaluated on speech samples from speakers with dysarthria and speakers with cleft lip and palate.
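To make the pipeline described above concrete, the sketch below shows one plausible shape for the 4-class frame classifier and a nasalization score computed from its posteriors. The feature dimensionality, network size, class ordering, and the choice to average the posterior mass of the nasal classes over frames are all assumptions for illustration, not the authors' exact configuration.

```python
# Hypothetical sketch: 4-class frame classifier and a nasalization score
# derived from its posteriors. Sizes, features, and the scoring rule are
# illustrative assumptions, not the paper's reported setup.
import torch
import torch.nn as nn

# Assumed class ordering:
# 0 = oral consonant, 1 = nasal consonant, 2 = oral vowel, 3 = nasalized vowel
NASAL_CLASSES = (1, 3)


class FrameClassifier(nn.Module):
    """Feed-forward network mapping one acoustic feature frame to 4 class logits."""

    def __init__(self, n_features: int = 39, n_hidden: int = 256, n_classes: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_classes),
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (n_frames, n_features) -> (n_frames, 4) logits
        return self.net(frames)


def nasalization_score(model: FrameClassifier, frames: torch.Tensor) -> float:
    """Average posterior mass assigned to the nasal classes across an utterance.

    One plausible way to turn frame-level posteriors into an utterance-level
    nasalization score; the paper may aggregate differently.
    """
    with torch.no_grad():
        posteriors = torch.softmax(model(frames), dim=-1)            # (n_frames, 4)
        nasal_mass = posteriors[:, list(NASAL_CLASSES)].sum(dim=-1)  # (n_frames,)
    return nasal_mass.mean().item()


if __name__ == "__main__":
    # Placeholder input: 200 frames of 39-dim acoustic features (e.g., MFCCs).
    model = FrameClassifier()
    dummy_frames = torch.randn(200, 39)
    print(f"Nasalization score: {nasalization_score(model, dummy_frames):.3f}")
```

Under this sketch, the classifier would be trained only on healthy speech with frame labels for the four classes, and the utterance-level score for a clinical sample would then be compared against clinician ratings of hypernasality.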