Staged Training Strategy And Multi-Activation For Audio Tagging With Noisy And Sparse Multi-Label Data
Kexin He, Yuhan Shen, Wei-Qiang Zhang, Jia Liu
Audio tagging aims to predict whether certain acoustic events occur in an audio clip. Because manually labeled data with high confidence are difficult and costly to obtain, researchers have begun to focus on audio tagging with a small set of manually labeled data and a larger set of noisily labeled data. Moreover, audio tagging is a sparse multi-label classification task, in which only a small number of acoustic events may occur in any given audio clip. In this paper, we propose a staged training strategy to deal with noisy labels, and adopt a sigmoid-sparsemax multi-activation structure to handle the sparse multi-label classification. This paper improves and extends our previous work for Task 2 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 Challenge. We evaluate our methods on the same task and achieve state-of-the-art performance, with a label-weighted label-ranking average precision (lwlrap) score of 0.7591 on the official evaluation dataset.
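Since the abstract only names the sigmoid-sparsemax combination without specifying how the two activations are fused, the following PyTorch sketch is an illustration rather than the authors' exact design: it assumes two parallel linear heads, one passed through a sigmoid and one through a sparsemax projection, whose outputs are averaged. The `SigmoidSparsemaxHead` class name and the averaging scheme are hypothetical.

```python
import torch
import torch.nn as nn


def sparsemax(logits: torch.Tensor) -> torch.Tensor:
    """Sparsemax over the last dimension (Martins & Astudillo, 2016):
    Euclidean projection of the logits onto the probability simplex,
    which can assign exactly zero probability to unlikely tags."""
    z_sorted, _ = torch.sort(logits, dim=-1, descending=True)
    k = torch.arange(1, logits.size(-1) + 1,
                     device=logits.device, dtype=logits.dtype)
    z_cumsum = z_sorted.cumsum(dim=-1)
    # Classes kept in the support satisfy 1 + k * z_(k) > cumulative sum.
    support = 1.0 + k * z_sorted > z_cumsum
    k_support = support.sum(dim=-1, keepdim=True)            # support size k(z)
    # Threshold tau so that the nonzero outputs sum to one.
    tau = (torch.gather(z_cumsum, -1, k_support - 1) - 1.0) / k_support.to(logits.dtype)
    return torch.clamp(logits - tau, min=0.0)


class SigmoidSparsemaxHead(nn.Module):
    """Hypothetical multi-activation output head: a sigmoid branch scores
    each tag independently, while a sparsemax branch encourages sparse
    predictions; the two scores are averaged (an assumption made for
    this sketch, not necessarily the combination used in the paper)."""

    def __init__(self, in_features: int, num_classes: int):
        super().__init__()
        self.sigmoid_fc = nn.Linear(in_features, num_classes)
        self.sparsemax_fc = nn.Linear(in_features, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        p_sigmoid = torch.sigmoid(self.sigmoid_fc(x))
        p_sparsemax = sparsemax(self.sparsemax_fc(x))
        return 0.5 * (p_sigmoid + p_sparsemax)


if __name__ == "__main__":
    # Toy usage: 80 tags as in DCASE 2019 Task 2, batch of 4 embeddings.
    head = SigmoidSparsemaxHead(in_features=128, num_classes=80)
    scores = head(torch.randn(4, 128))
    print(scores.shape)  # torch.Size([4, 80])
```

The appeal of such a combination is that the sigmoid branch keeps the task a set of independent per-tag detections, while the sparsemax branch pushes most tag scores to exactly zero, matching the sparse multi-label nature of the data.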