Towards Dialogue Modeling Beyond Text
Tongzi Wu (University of Toronto); Yuhao Zhou (Talka AI); Wang Ling (Talka AI); Hojin Yang (Talka AI); Joana Veloso (Talka AI); Lin Sun (Talka AI); Ruixin Huang (Talka AI); Norberto Guimaraes (Talka AI); Scott Sanner (University of Toronto)
SPS
In this paper, we model aspects of communication beyond the words that are said. Specifically, we aim to detect interruptions and active listening events, both of which are important elements in any dialogue. We build a dataset with fine-grained annotations for each category and train multimodal models that take into account all channels of a digital conversation: the video, the audio, and the text. Our experiments show that multimodality is necessary for modeling the non-textual complexity of conversation, as different artifacts require different modalities to be captured effectively.