Using Paralinguistic Information To Disambiguate User Intentions For Distinguishing Phrase Structure And Sarcasm In Spoken Dialog Systems
Zhengyu Zhou, In Gyu Choi, Yongliang He, Vikas Yadav, Chin-Hui Lee
SPS
Length: 0:14:23
This paper uses paralinguistic information that is usually hidden in speech signals, such as pitch, short pauses, and sarcasm, to disambiguate user intentions that are not easily distinguishable from the speech recognition and natural language understanding results provided by a state-of-the-art spoken dialog system (SDS). We propose two methods that address ambiguities in understanding named entities and sentence structures based on relevant speech cues and nuances. We also propose an approach to capturing sarcasm in speech and generating sarcasm-sensitive responses using an end-to-end neural network. We have also developed an SDS prototype that feeds signal information directly into the understanding and response generation components to support the three proposed applications. The experimental results in this initial study are encouraging and demonstrate the potential of this new research direction.