Training sound event detection with soft labels from crowdsourced annotations

Irene Martin (Tampere University); Manu Harju (Tampere University); Paul Ahokas (Tampere University); Annamaria Mesaros (Tampere University)

07 Jun 2023

In this paper, we study the use of soft labels to train a system for sound event detection (SED). Soft labels can result from annotations that account for human uncertainty about categories, or emerge as a natural representation of multiple opinions in annotation. Converting annotations to hard labels yields unambiguous categories for training, at the cost of losing details of the label distribution. This work investigates how soft labels can be used and what benefits they bring in training a SED system. The results show that the system can learn information about sound activity that is reflected in the soft labels, and can detect sounds that are missed in the typical binary-target training setup. We also release a new dataset produced through crowdsourcing, containing temporally strong labels for sound events in real-life recordings, with both soft and hard labels.
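The difference between soft and hard labels described above can be illustrated with a small sketch. It is not the authors' pipeline: the abstract does not say how opinions are aggregated, so plain averaging of binary annotator opinions, the 0.5 binarization threshold, the example data, and the function names `soft_labels`, `hard_labels`, and `binary_cross_entropy` are all assumptions made for illustration.

```python
import math


def soft_labels(opinions):
    """Average binary annotator opinions per time segment (assumed aggregation).

    opinions: one list per annotator, holding 0/1 for each segment.
    Returns the fraction of annotators who marked the event active, per segment.
    """
    n_annotators = len(opinions)
    return [sum(column) / n_annotators for column in zip(*opinions)]


def hard_labels(soft, threshold=0.5):
    """Binarize soft labels; the 0.5 threshold is an assumption."""
    return [1 if s >= threshold else 0 for s in soft]


def binary_cross_entropy(targets, predictions, eps=1e-7):
    """Mean binary cross-entropy; targets may be any value in [0, 1]."""
    total = 0.0
    for t, p in zip(targets, predictions):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(targets)


# Hypothetical opinions of 4 annotators on 5 segments for a single sound class.
opinions = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 0, 0, 0, 1],
]

soft = soft_labels(opinions)  # keeps the distribution of opinions
hard = hard_labels(soft)      # unambiguous, but drops the minority opinion

print("soft:", soft)
print("hard:", hard)
```

In this toy case the last segment, marked active by one annotator out of four, keeps a nonzero soft label but becomes 0 after binarization. That is the kind of detail a binary target discards. Because the cross-entropy above accepts targets anywhere in [0, 1], the same loss can in principle be used with either label type.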

Tags:

Detection and classification of acoustic scenes and events

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle