Automatic Elicitation Compliance For Short-Duration Speech Based Depression Detection

Brian Stasak, Zhaocheng Huang, Dale Joachim, Julien Epps

Length: 00:15:32

11 Jun 2021

Detecting depression from the voice in naturalistic environments is challenging, particularly for short-duration audio recordings. This increases the need to interpret and make optimal use of elicited speech. The rapid consonant-vowel syllable combination ‘pataka’ has frequently been selected as a clinical motor-speech task. However, elicited recordings vary considerably, and this variability remains to be investigated. In this multi-corpus study of over 25,000 ‘pataka’ utterances, speech landmark-based features were found to be sensitive to the number of ‘pataka’ utterances per recording. This landmark feature sensitivity was newly exploited to automatically estimate ‘pataka’ count and rate, achieving root mean square errors nearly three times lower than chance level. When this count and rate knowledge of the elicited speech is leveraged for depression detection, results show that the estimated ‘pataka’ number and rate are important for normalizing evaluative ‘pataka’ speech data. Count- and/or rate-normalized ‘pataka’ models produced relative reductions in depression classification error of up to 26% compared with non-normalized models.
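
The two evaluation quantities named in the abstract, root mean square error against a chance-level baseline and relative reduction in classification error, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names are hypothetical, the chance-level baseline is assumed here to be a guesser that always predicts the mean count (the abstract does not define it), and all numbers are made-up toy values rather than data from the study.

```python
import math


def rmse(true_values, estimates):
    """Root mean square error between two equal-length sequences."""
    if len(true_values) != len(estimates) or len(true_values) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - e) ** 2 for t, e in zip(true_values, estimates)]
    return math.sqrt(sum(squared) / len(squared))


def chance_rmse(true_values):
    """RMSE of an ASSUMED chance-level guesser that always predicts the mean."""
    mean = sum(true_values) / len(true_values)
    return rmse(true_values, [mean] * len(true_values))


def relative_error_reduction(baseline_error, new_error):
    """Fraction by which new_error improves on baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Toy, made-up numbers for illustration only (not data from the study).
true_counts = [4, 6, 8, 10]        # hypothetical 'pataka' counts per recording
estimated_counts = [5, 6, 7, 10]   # hypothetical automatic estimates

model_rmse = rmse(true_counts, estimated_counts)
baseline_rmse = chance_rmse(true_counts)
print(f"model RMSE {model_rmse:.3f} vs chance RMSE {baseline_rmse:.3f}")

# Hypothetical classification error rates before and after normalization.
print(f"relative reduction: {relative_error_reduction(0.40, 0.30):.0%}")
```

A relative reduction is expressed as a fraction of the baseline error, so a figure such as "up to 26%" describes how much of the non-normalized model's error was removed, not a drop of 26 percentage points.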

Chairs:

Mathew Magimai Doss

Tags: Signal Processing Society (SPS), IEEE ICASSP 2021, virtual conference, June 6-11 2021