MMATR: A lightweight approach for Multimodal Sentiment Analysis based on tensor methods

Panagiotis Koromilas (University of Athens); Mihalis A Nicolaou (The Cyprus Institute); Theodoros Giannakopoulos (NCSR Demokritos); Yannis Panagakis (University of Athens)

06 Jun 2023

Despite the considerable research output on multimodal learning for affect-related tasks, most current methods have a very large number of trainable parameters, which makes them impractical for real-life applications. In this work we address this gap by introducing the Multimodal Attention Tensor Regression (MMATR) network, a lightweight model based on: (i) a static input representation (a 2D matrix of dimensions time $\times$ features) for each modality, which lets a CNN take the place of heavily parameterized sequential models; (ii) tensor contraction and tensor regression layers that replace the usual pooling, flattening, and linear layers, reducing the number of parameters while preserving the high-order structure of the multimodal data; and (iii) a bimodal attention layer that learns multimodal co-occurrences. Through a set of experiments comparing against a variety of state-of-the-art techniques, we show that MMATR achieves competitive results on the task of Multimodal Sentiment Analysis while having four orders of magnitude fewer parameters.
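To give a feel for why a tensor regression layer can be much smaller than a flatten-plus-linear head, here is a minimal NumPy sketch of a Tucker-factorized regression weight. It is not the authors' implementation: the activation shape (16, 20, 32), the ranks (2, 2, 2), and the function names `tucker_weight`, `tensor_regression`, and `param_counts` are all assumptions chosen for illustration.

```python
import numpy as np

def tucker_weight(core, factors):
    """Rebuild the full 3-way weight tensor from a Tucker core and factors."""
    u1, u2, u3 = factors
    return np.einsum("abc,ia,jb,kc->ijk", core, u1, u2, u3)

def tensor_regression(x, core, factors, bias=0.0):
    """Scalar output: inner product of x with the low-rank weight, plus bias."""
    weight = tucker_weight(core, factors)
    return float(np.tensordot(x, weight, axes=3)) + bias

def param_counts(shape, ranks):
    """Parameters of a dense (flatten + linear) head vs. the Tucker form."""
    dense = int(np.prod(shape))
    low_rank = int(np.prod(ranks)) + sum(d * r for d, r in zip(shape, ranks))
    return dense, low_rank

# Hypothetical sizes: (channels, time, features) activation and Tucker ranks.
rng = np.random.default_rng(0)
shape, ranks = (16, 20, 32), (2, 2, 2)
core = rng.standard_normal(ranks)
factors = [rng.standard_normal((d, r)) for d, r in zip(shape, ranks)]

x = rng.standard_normal(shape)          # stand-in for a CNN activation tensor
y = tensor_regression(x, core, factors)
dense, low_rank = param_counts(shape, ranks)
print(f"dense head: {dense} weights, Tucker head: {low_rank} weights")
```

With these illustrative sizes the dense head needs 10240 weights and the Tucker head 144, and the input is never flattened, so its mode structure is kept. The savings in an actual model depend on the chosen ranks and layer shapes.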

Tags:

Machine/deep learning methodologies for multimedia

Conference: IEEE ICASSP 2023, 4-10 June 2023, Greece (virtual and in-person)
