LEARNING ROBUST SELF-ATTENTION FEATURES FOR SPEECH EMOTION RECOGNITION WITH LABEL-ADAPTIVE MIXUP

Lei Kang (Shantou University); Lichao Zhang (Air Force Engineering University); Dazhi Jiang (Shantou University)

06 Jun 2023

Speech Emotion Recognition (SER) aims to recognize human emotions during natural verbal interaction with machines, and it is considered a challenging problem because human emotions are ambiguous. Despite recent progress in SER, state-of-the-art models still struggle to achieve satisfactory performance. We propose a self-attention based method that combines label-adaptive mixup with center loss. By adapting label probabilities in mixup and fitting center loss to the mixup training scheme, our proposed method achieves superior performance to state-of-the-art methods. Our code will be made publicly available upon acceptance.
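
The abstract names two ingredients, mixup and center loss, without giving their formulas. As a rough illustration only, the sketch below shows standard mixup (a convex combination of two samples and their label distributions) and one possible way to evaluate center loss against a soft, mixed label. It is not the authors' implementation: the label-adaptive rule from the paper is not reproduced here, and the function names, the Beta parameter 0.4, and the soft-label weighting of the center distances are all assumptions made for this example.

```python
import random


def mixup(x_a, x_b, y_a, y_b, lam):
    """Standard mixup: blend two feature vectors and their label
    distributions with the same coefficient lam in [0, 1].
    (The paper adapts the label coefficient; that rule is not shown.)"""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def center_loss(feature, soft_label, centers):
    """Half the squared distance from the feature to each class center,
    weighted by the soft label (an assumed extension to mixed labels).
    With a one-hot label this is the usual center loss for that class."""
    loss = 0.0
    for weight, center in zip(soft_label, centers):
        sq_dist = sum((f - c) ** 2 for f, c in zip(feature, center))
        loss += 0.5 * weight * sq_dist
    return loss


if __name__ == "__main__":
    rng = random.Random(0)
    lam = rng.betavariate(0.4, 0.4)  # hypothetical Beta(alpha, alpha) choice

    # Two toy 2-D "utterance features" with one-hot emotion labels.
    x, y = mixup([1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], lam)

    # Hypothetical class centers; in training these would be learned.
    centers = [[1.0, 0.0], [0.0, 1.0]]
    print("mixed label:", y)
    print("center loss:", center_loss(x, y, centers))
```

In a training loop, a term like this would typically be added to the classification loss on the mixed label, with the class centers updated alongside the network weights.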

Tags:

Speech analysis and Language disorder Analysis

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
