The Benefit Of Temporally-Strong Labels In Audio Event Classification

Shawn Hershey, Daniel P. W. Ellis, Eduardo Fonseca, Aren Jansen, Caroline Liu, R Channing Moore, Manoj Plakal

DOI

SPS

Members: Free
IEEE Members: $11.00
Non-members: $15.00

Length: 00:15:10

09 Jun 2021

To reveal the importance of temporal precision in ground truth audio event labels, we collected precise (∼0.1 sec resolution) “strong” labels for a portion of the AudioSet dataset. We devised a temporallystrong evaluation set (including explicit negatives of varying difficulty) and a small strong-labeled training subset of 67k clips (compared to the original dataset’s 1.8M clips labeled at 10 sec resolution). We show that fine-tuning with a mix of weak- and stronglylabeled data can substantially improve classifier performance, even when evaluated using only the original weak labels. For a ResNet50 architecture, d' on the strong evaluation data including explicit negatives improves from 1.13 to 1.41. The new labels are available as an update to AudioSet.

Chairs:

Romain Serizel

Tags:

signal processing society

IEEE icassp 2021

virtual conference

2021

sps

virtual conference icassp 2021

june 6-11 2021

icassp 2021

The Benefit Of Temporally-Strong Labels In Audio Event Classification

Shawn Hershey, Daniel P. W. Ellis, Eduardo Fonseca, Aren Jansen, Caroline Liu, R Channing Moore, Manoj Plakal

Value-Added Bundle(s) Including this Product

ICASSP 2021 Virtual Conference - Presentation Videos Product Bundle

More Like This

Welcome and Opening Remarks for the IEEE SustainTech Leadership Forum

Panel: Building Sustainable Cities for Tomorrow

Panel: Unleashing the Potential of Virtual Power Plants for Sustainable Energy Solutions

Join the IEEE Signal Processing Society