Neural Architecture Search for Speech Emotion Recognition

Xixin Wu, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng

Length: 00:10:22

09 May 2022

Deep neural networks have brought significant advances to speech emotion recognition (SER). However, architecture design in SER still relies mainly on expert knowledge and empirical trial-and-error evaluation, which is time-consuming and resource-intensive. In this paper, we propose applying neural architecture search (NAS) techniques to configure SER models automatically. To accelerate the optimization of candidate architectures, we propose a uniform path dropout strategy that encourages all candidate architecture operations to be optimized equally. Experimental results with two different neural structures on IEMOCAP show that NAS can improve SER performance (from 54.89% to 56.28%) while maintaining model parameter sizes. The proposed dropout strategy also outperforms previous approaches.
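The abstract does not spell out how the uniform path dropout strategy is implemented. As a rough illustration only, one plausible reading is that each training step keeps a single candidate operation per searchable edge, chosen uniformly at random, and drops the others, so that every operation receives about the same number of updates. The operation names, the number of edges, and the helper functions below are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter

# Hypothetical candidate operations for one searchable edge; these names are
# placeholders, not the search space used in the paper.
CANDIDATE_OPS = ["conv3", "conv5", "attention", "skip"]


def sample_path(num_edges, ops, rng):
    """Keep one uniformly chosen operation per edge and drop the rest."""
    return [rng.choice(ops) for _ in range(num_edges)]


def count_updates(num_steps, num_edges, ops, seed=0):
    """Count how often each (edge, op) pair would receive a training update."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_steps):
        path = sample_path(num_edges, ops, rng)
        for edge, op in enumerate(path):
            # Only the operation kept on this edge would be trained this step.
            counts[(edge, op)] += 1
    return counts


if __name__ == "__main__":
    counts = count_updates(num_steps=8000, num_edges=2, ops=CANDIDATE_OPS)
    for key in sorted(counts):
        print(key, counts[key])
```

Under this assumption, each of the four operations on an edge is selected in roughly a quarter of the steps, which is the sense in which the candidates are "equally optimized". A real supernet would replace the counter with an actual forward and backward pass through the sampled path.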

Tags:

neural architecture search

path dropout

uniform sampling

speech emotion recognition

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
