HEALTHCALL CORPUS AND TRANSFORMER EMBEDDINGS FROM HEALTHCARE CUSTOMER-AGENT CONVERSATIONS
Nikola Lackovic (Malakoff Humanis); Claude Montacié (Sorbonne Université); Cédric Lequilliec (Malakoff Humanis); Marie-José Caraty (Sorbonne Université)
SPS
We present HealthCall, a corpus recorded in real-life conditions in the call center of Malakoff Humanis. Each recording has two separate audio channels, the first for the customer and the second for the agent. The corpus includes transcriptions of the spoken conversations and is divided into three sets: Train, Devel and Test. Two customer relationship management tasks were assessed on the HealthCall corpus: the classification of user requests and the detection of complaints. For this purpose, we investigated 18 feature sets, 12 linguistic and 6 audio, using BERT models for the linguistic features and a Wav2Vec model for the audio features. The results show that the linguistic features always give the best results (92.7% for the Request task and 69.0% for the Complaint task), although concatenating acoustic and linguistic features yields a slight improvement on the Complaint task (69.4% versus 69.0%).
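The concatenation of acoustic and linguistic features mentioned above can be illustrated with a minimal sketch. It assumes each conversation has already been reduced to one fixed-size vector per modality; the function name `fuse_features`, the embedding sizes (768 and 512) and the random data are hypothetical placeholders, not the feature sets or dimensions used in the paper.

```python
import numpy as np


def fuse_features(linguistic, acoustic):
    """Concatenate per-conversation feature vectors along the feature axis.

    Both inputs are 2-D arrays of shape (n_conversations, n_features).
    """
    linguistic = np.asarray(linguistic, dtype=np.float32)
    acoustic = np.asarray(acoustic, dtype=np.float32)
    if linguistic.shape[0] != acoustic.shape[0]:
        raise ValueError("both feature sets must cover the same conversations")
    return np.concatenate([linguistic, acoustic], axis=1)


# Hypothetical data: 4 conversations, random stand-ins for real embeddings.
rng = np.random.default_rng(0)
text_emb = rng.normal(size=(4, 768))   # assumed linguistic embedding size
audio_emb = rng.normal(size=(4, 512))  # assumed audio embedding size

fused = fuse_features(text_emb, audio_emb)
print(fused.shape)  # (4, 1280)
```

The fused matrix can then be passed to any classifier in place of a single-modality feature set; which classifier the authors used is not stated in this abstract.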