Simple Pooling Front-ends for Efficient Audio Classification

Xubo Liu (University of Surrey); Haohe Liu (University of Surrey); Qiuqiang Kong (ByteDance); Xinhao Mei (University of Surrey); Mark D. Plumbley (University of Surrey); Wenwu Wang (University of Surrey)

06 Jun 2023

Recently, there has been increasing interest in building efficient audio neural networks for on-device scenarios. Most existing approaches reduce the size of audio neural networks using methods such as model pruning. In this work, we show that, instead of reducing model size with complex methods, eliminating the temporal redundancy in the input audio features (e.g., the mel-spectrogram) can be an effective approach to efficient audio classification. To this end, we propose a family of simple pooling front-ends (SimPFs), which use simple non-parametric pooling operations to reduce the redundant information within the mel-spectrogram. We perform extensive experiments on four audio classification tasks to evaluate the performance of SimPFs. Experimental results show that SimPFs can reduce the number of floating point operations (FLOPs) of off-the-shelf audio neural networks by more than half, with negligible degradation of, or even some improvement in, audio classification performance.
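The idea of a non-parametric pooling front-end can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pool_time_axis`, the `factor` and `mode` parameters, the (time, mel_bins) layout, and the choice of non-overlapping average or max pooling are all assumptions made here for illustration. The sketch only shows how pooling along the time axis shortens a mel-spectrogram before it reaches a classifier.

```python
import numpy as np


def pool_time_axis(mel, factor, mode="avg"):
    """Pool a (time, mel_bins) spectrogram along time with non-overlapping windows.

    Hypothetical helper for illustration; `factor` is the assumed compression
    factor (number of consecutive frames merged into one).
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    n_frames, n_bins = mel.shape
    # Drop trailing frames that do not fill a complete window.
    usable = (n_frames // factor) * factor
    windows = mel[:usable].reshape(usable // factor, factor, n_bins)
    if mode == "avg":
        return windows.mean(axis=1)
    if mode == "max":
        return windows.max(axis=1)
    raise ValueError("mode must be 'avg' or 'max'")


# Random data standing in for a mel-spectrogram of 1000 frames and 64 mel bins.
rng = np.random.default_rng(0)
mel = rng.standard_normal((1000, 64))
pooled = pool_time_axis(mel, factor=2)
print(mel.shape, "->", pooled.shape)  # (1000, 64) -> (500, 64)
```

With `factor=2` the downstream network sees half as many frames. For a model whose cost grows roughly in proportion to the number of input frames, this would be expected to cut computation by a similar ratio, although the actual saving depends on the architecture.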

Tags:

Audio for multimedia and audio processing systems

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
