A Hierarchical Model For Dialog Act Recognition Considering Acoustic And Lexical Context Information
Yuke Si, Longbiao Wang, Jianwu Dang, Mengfei Wu, Aijun Li
SPS
Length: 11:29
Dialog act recognition (DAR) is important for capturing speakers' intentions in a dialog system. Traditional methods commonly use lexical information from transcripts, acoustic information from speech, and dialog context information for DAR. However, these methods may consider textual context information while ignoring acoustic context information, which leads to ambiguity in certain dialog acts (DAs), especially in Mandarin. To solve this problem, we propose a hierarchical model for DAR that considers both lexical and acoustic-prosodic context information. Experimental results on a Mandarin dialog corpus demonstrate that contextual acoustic information is helpful for recognizing DAs. The context-specific prosody of utterances such as echo questions and open-ended questions helps identify the users' intention. We also investigate the effect of context length on DAR. The appropriate context length is approximately the length of an entire subtopic.
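The idea of combining per-utterance lexical and acoustic features with a window of preceding context can be illustrated with a minimal sketch. This is not the authors' model: the abstract does not specify the architecture, so the concatenation step, the mean pooling over the window, and the names `fuse`, `mean_pool`, `contextual_features`, and `context_len` are all assumptions made for illustration. A real system would likely replace both levels with learned encoders.

```python
from typing import List, Sequence

Vector = List[float]


def fuse(lexical: Sequence[float], acoustic: Sequence[float]) -> Vector:
    """Utterance level (assumed): concatenate lexical and acoustic features."""
    return list(lexical) + list(acoustic)


def mean_pool(vectors: Sequence[Sequence[float]]) -> Vector:
    """Average a non-empty list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def contextual_features(
    lexical: Sequence[Sequence[float]],
    acoustic: Sequence[Sequence[float]],
    context_len: int,
) -> List[Vector]:
    """Dialog level (assumed): append a pooled context summary to each utterance.

    For utterance i, the window covers up to `context_len` preceding
    utterances plus utterance i itself.
    """
    if len(lexical) != len(acoustic):
        raise ValueError("lexical and acoustic sequences must align")
    if context_len < 0:
        raise ValueError("context_len must be non-negative")

    fused = [fuse(lex, aco) for lex, aco in zip(lexical, acoustic)]
    features = []
    for i, current in enumerate(fused):
        start = max(0, i - context_len)
        window = fused[start : i + 1]
        features.append(current + mean_pool(window))
    return features


# Toy dialog: three utterances, one lexical and one acoustic feature each.
lexical_feats = [[1.0], [3.0], [5.0]]
acoustic_feats = [[0.0], [2.0], [4.0]]
result = contextual_features(lexical_feats, acoustic_feats, context_len=1)
```

Varying the hypothetical `context_len` parameter in a sketch like this is one way to probe how much preceding dialog a classifier benefits from, which is the kind of question the context-length experiment addresses.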