Adaptive Mask Co-optimization for Modal Dependence in Multimodal Learning
Ying Zhou (Xidian University); Xuefeng Liang (Xidian University); ShiQuan Zheng (Xidian University); Huijun Xuan (Xidian University); Takatsune Kumada (Kyoto University)
Multimodal learning has shown a clear advantage in emotion recognition because different modalities provide richer information. However, multimodal models tend to rely on the modalities that are easier to learn while under-fitting the others, which leads to sub-optimal results. To address this problem, we propose a novel plug-in module, Adaptive Mask Co-optimization (AMCo), which can be inserted into advanced models. The adaptive mask encourages the model to fit the other modalities better by making the dependent modalities harder to learn. The co-optimization preserves the model's performance on the dependent modalities, so it does not degrade. Extensive experiments on the IEMOCAP dataset show that AMCo improves the accuracy of four SOTA models by 1.14% to 3.03%.
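The abstract does not specify how the mask is computed or how the two objectives are combined, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a hypothetical per-modality "reliance" score (for example, a unimodal validation accuracy), a hypothetical linear rule that maps higher reliance to a larger mask ratio, and a hypothetical weighted sum of the masked and unmasked losses standing in for co-optimization. All function names, the `max_ratio` cap, and the `weight` parameter are assumptions made for illustration.

```python
import random


def adaptive_mask_ratios(reliance, max_ratio=0.5):
    """Map per-modality reliance scores to mask ratios.

    Hypothetical rule: the least relied-on modality is not masked,
    the most relied-on one is masked at max_ratio, and the rest are
    scaled linearly in between.
    """
    lowest = min(reliance.values())
    highest = max(reliance.values())
    span = highest - lowest
    if span == 0:
        # All modalities are used equally, so nothing is masked.
        return {name: 0.0 for name in reliance}
    return {
        name: max_ratio * (score - lowest) / span
        for name, score in reliance.items()
    }


def apply_mask(features, ratio, rng):
    """Zero out round(ratio * len(features)) randomly chosen positions."""
    n_masked = round(ratio * len(features))
    masked_positions = set(rng.sample(range(len(features)), n_masked))
    return [0.0 if i in masked_positions else x for i, x in enumerate(features)]


def co_optimized_loss(loss_masked, loss_unmasked, weight=0.5):
    """Assumed stand-in for co-optimization: a weighted sum of the loss on
    masked inputs and the loss on the original, unmasked inputs."""
    return weight * loss_masked + (1.0 - weight) * loss_unmasked


# Illustrative, made-up reliance scores for three modalities.
reliance = {"audio": 0.70, "text": 0.90, "video": 0.50}
ratios = adaptive_mask_ratios(reliance)

rng = random.Random(0)
text_features = [1.0] * 10
masked_text = apply_mask(text_features, ratios["text"], rng)
print(ratios)
print(masked_text)
```

In this sketch the most relied-on modality ("text") loses half of its feature values, while the least relied-on modality ("video") is left untouched. Keeping an unmasked loss term in the objective is one simple way to keep the model from forgetting how to use the dominant modality.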