Knowledge-aware Bayesian Co-attention for Multimodal Emotion Recognition
Zihan Zhao (Shanghai Jiao Tong University); Yu Wang (Shanghai Jiao Tong University); Yan-Feng Wang (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University)
SPS
Multimodal emotion recognition is a challenging research area that aims to fuse different modalities to predict human emotion. However, most existing attention-based models have difficulty learning the emotionally relevant parts of the input on their own. To address this problem, we propose incorporating external emotion-related knowledge into the co-attention-based fusion of pre-trained models. To incorporate this knowledge effectively, we enhance the co-attention model with a Bayesian attention module (BAM), in which a prior distribution is estimated from the emotion-related knowledge. Experimental results on the IEMOCAP dataset show that the proposed approach outperforms a number of state-of-the-art approaches by at least 0.7% unweighted accuracy (UA).
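For readers unfamiliar with the reported metric: unweighted accuracy (UA) is commonly computed as the mean of per-class recalls, so every emotion class counts equally regardless of how many utterances it has. The following is a minimal sketch of that computation, contrasted with plain (weighted) accuracy. The function names and the toy labels are illustrative assumptions and are not taken from the paper or from IEMOCAP.

```python
from collections import defaultdict


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls; each class contributes equally."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


def weighted_accuracy(y_true, y_pred):
    """Fraction of all samples classified correctly."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


# Hypothetical toy labels (not real data): an imbalanced three-class set.
y_true = ["neutral"] * 6 + ["angry"] * 2 + ["sad"] * 2
y_pred = ["neutral"] * 6 + ["angry", "neutral"] + ["neutral", "neutral"]

ua = unweighted_accuracy(y_true, y_pred)
wa = weighted_accuracy(y_true, y_pred)
print(f"UA = {ua:.2f}, WA = {wa:.2f}")
```

On imbalanced data the two metrics can diverge noticeably, which is one reason UA is often reported for emotion recognition benchmarks: a model that favors the majority class is penalized under UA.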