TNTC: two-stream network with transformer-based complementarity for gait-based emotion recognition

Chuanfei Hu, Weijie Sheng, Xinde Li, Bo Dong

Length: 00:07:58

08 May 2022

Automatically recognizing human emotion from visual characteristics plays a vital role in many intelligent applications. Recently, gait-based emotion recognition, especially from gait skeletons, has attracted much attention, and many methods have been proposed. The popular pipeline first extracts affective features from joint skeletons, then aggregates the skeleton joint and affective features into a feature vector for classifying the emotion. However, the aggregation procedure of these methods can be rigid, so the complementary relationship between skeleton joint and affective features is insufficiently exploited. Meanwhile, long-range dependencies in both the spatial and temporal domains of the gait sequence are scarcely considered. To address these issues, we propose a novel two-stream network with transformer-based complementarity, termed TNTC. Skeleton joint and affective features are encoded into two individual images that serve as the inputs of the two streams, respectively. A new transformer-based complementarity module (TCM) is proposed to bridge the complementarity between the two streams hierarchically by capturing long-range dependencies. Experimental results demonstrate that TNTC outperforms state-of-the-art methods on the latest dataset in terms of accuracy.
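The two-stream idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' TCM: the abstract does not specify the image encoding, the affective features, or the attention layout, so everything below is an assumption made for illustration. In particular, the joint-pair distances stand in for real affective features, each frame is treated as one token, the projection weights are random rather than learned, and the function names (`joints_to_image`, `affective_to_image`, `cross_attention`) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def joints_to_image(gait):
    # gait: (T, J, 3) joint coordinates -> (T, J, 3) "image" scaled to [0, 1].
    # Assumed encoding: frames as rows, joints as columns, xyz as channels.
    lo, hi = gait.min(), gait.max()
    return (gait - lo) / (hi - lo + 1e-8)

def affective_to_image(gait, pairs):
    # Placeholder affective features (an assumption): the distance between
    # selected joint pairs in every frame, giving a (T, len(pairs)) array.
    feats = np.stack(
        [np.linalg.norm(gait[:, a] - gait[:, b], axis=-1) for a, b in pairs],
        axis=1,
    )
    return feats / (feats.max() + 1e-8)

def cross_attention(queries, context, rng, dim=16):
    # Single-head cross-attention with random (untrained) projections.
    # Every query token attends to every context token, so frames that are
    # far apart in time can still exchange information.
    wq = rng.standard_normal((queries.shape[-1], dim)) / np.sqrt(queries.shape[-1])
    wk = rng.standard_normal((context.shape[-1], dim)) / np.sqrt(context.shape[-1])
    wv = rng.standard_normal((context.shape[-1], dim)) / np.sqrt(context.shape[-1])
    q, k, v = queries @ wq, context @ wk, context @ wv
    attn = softmax(q @ k.T / np.sqrt(dim), axis=-1)  # (T_query, T_context)
    return attn @ v, attn

rng = np.random.default_rng(0)
T, J = 48, 16                                # frames and joints (made-up sizes)
gait = rng.standard_normal((T, J, 3))        # synthetic gait sequence

joint_img = joints_to_image(gait)                             # (T, J, 3)
affect_img = affective_to_image(gait, [(0, 1), (0, 2), (1, 2)])  # (T, 3)

joint_tokens = joint_img.reshape(T, -1)      # one token per frame, (T, J*3)
affect_tokens = affect_img                   # one token per frame, (T, 3)

# Each stream queries the other, in both directions.
joint_from_affect, attn_ja = cross_attention(joint_tokens, affect_tokens, rng)
affect_from_joint, attn_aj = cross_attention(affect_tokens, joint_tokens, rng)
```

Here each stream receives a summary of the other stream computed over all frames, which is one plausible reading of "bridging complementarity via long-range dependencies"; a real model would learn the projections and apply such a module at several depths of the two convolutional streams.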

Tags: gait-based emotion recognition, complementarity, transformer, convolutional neural network

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
