Multi-Scale Receptive Field Graph Model for Emotion Recognition in Conversations
Jie Wei (Xi'an Jiaotong University); Guanyu Hu (Xi'an Jiaotong University); Anh Tuan Luu (Nanyang Technological University); Xinyu Yang (Xi'an Jiaotong University); Wenjing Zhu (DXM)
Emotion recognition in conversations (ERC) has gained increasing attention in recent years, with contextual information modeling and multimodal fusion remaining its central challenges. In this paper, we propose a Multi-Scale Receptive Field Graph model (MSRFG) to tackle these challenges. Specifically, MSRFG constructs multi-scale perception graphs and learns contextual information via parallel multi-scale receptive field paths. To compensate for the graph network's limited ability to learn temporal information, MSRFG injects temporal dependencies into the graph network to model the temporal relationships between utterances. Moreover, to fuse multimodal information effectively, MSRFG converges the multi-scale features of each modality separately and learns attention weights after integrating the converged features. We carried out experiments on the IEMOCAP and MELD datasets to validate the proposed method, and the results demonstrate the superiority of our model over existing SOTA methods. The code is available at https://github.com/Janie1996/MSRFG.
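To make the multi-scale receptive field idea concrete, the core intuition can be sketched as aggregating utterance-node features over neighborhoods of different hop sizes on a conversation graph and then combining the per-scale outputs. This is a minimal, hypothetical NumPy illustration of k-hop aggregation via powers of a normalized adjacency matrix, not the authors' actual MSRFG implementation; the function names and the choice of scales are assumptions.

```python
import numpy as np

def normalize_adj(A):
    # Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, a common
    # preprocessing step for graph convolutions (self-loops added).
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def multi_scale_aggregate(X, A, scales=(1, 2, 4)):
    # Illustrative sketch: each scale k aggregates node features over a
    # k-hop receptive field (power of the normalized adjacency), giving
    # parallel paths with different context sizes; the per-scale outputs
    # are then averaged. The scales (1, 2, 4) are placeholder choices.
    A_norm = normalize_adj(A)
    outs = [np.linalg.matrix_power(A_norm, k) @ X for k in scales]
    return np.mean(outs, axis=0)

# Toy usage: 3 utterances connected as a chain (dialogue order),
# with one-hot placeholder features per utterance.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
X = np.eye(3)
fused = multi_scale_aggregate(X, A)  # shape (3, 3)
```

In a real model, each scale path would apply learned weights and a nonlinearity rather than a bare matrix power, and the per-scale outputs would be fused by attention rather than a plain average, but the receptive-field structure is the same.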