Multi-Local Attention for Speech-based Depression Detection
Fuxiang Tao (University of Glasgow); Xuri Ge (University of Glasgow); Wei Ma (University of Glasgow); Anna Esposito (Università di Napol (Italy)); Alessandro Vinciarelli (University of Glasgow)
SPS
This article shows that an attention mechanism, Multi-Local Attention, can improve a depression detection approach based on Long Short-Term Memory networks. Besides leading to higher performance metrics (e.g., Accuracy and F1 Score), Multi-Local Attention improves two other aspects of the approach, both important from an application point of view. The first is the effectiveness of a confidence score associated with the detection outcome at identifying speakers more likely to be classified correctly. The second is the amount of speaking time needed to classify a speaker as depressed or non-depressed. The experiments were performed on read speech and involved 109 participants (including 55 diagnosed with depression by professional psychiatrists). The results show accuracies up to 88.0% (F1 Score 88.0%).