TEMPORAL CONTRASTIVE-LOSS FOR AUDIO EVENT DETECTION

Sandeep Kothinti, Mounya Elhilali

Length: 00:13:46

09 May 2022

Temporal coherence is a feature-binding mechanism which ensures that features evolving together in time are assigned to the same object or event. Coherence has been extensively studied in biological systems, demonstrating how our brain leverages this mechanism to perform complex tasks in real environments and to facilitate segregation of complex sensory signals (or wholes) into individual objects (or parts), following Gestalt principles. Though intuitive and computationally tractable, these concepts have rarely been leveraged in audio technologies. Audio event detection is an application that specifically deals with identifying sound events in an audio recording; hence, it is a natural avenue to explore principles of temporal coherence. In this study, we propose coherence-based learning, formulated as a contrastive loss, to train event detection models whereby embeddings driven by acoustic events are coherently constrained to maximize discriminability across events. This approach results in improved detection performance with no additional computational cost at inference and only a very small overhead during training.
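The abstract does not give the exact form of the loss, so the following is only a minimal sketch of the general idea, assuming a supervised-contrastive-style objective over frame-level embeddings: frames that belong to the same event are pulled together and frames from different events are pushed apart. The function name `temporal_contrastive_loss`, the `temperature` parameter, and the use of per-frame event labels are illustrative assumptions, not the authors' formulation.

```python
import math


def temporal_contrastive_loss(embeddings, labels, temperature=0.1):
    """Illustrative contrastive loss over frame embeddings (hypothetical helper).

    embeddings: list of per-frame feature vectors, one per time step.
    labels: list of per-frame event labels of the same length.
    """
    # L2-normalise each frame so dot products become cosine similarities.
    unit = []
    for vec in embeddings:
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        unit.append([v / norm for v in vec])

    per_anchor = []
    n = len(unit)
    for i in range(n):
        # Temperature-scaled similarity of anchor frame i to every other frame.
        sims = {
            j: sum(a * b for a, b in zip(unit[i], unit[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Positives are the other frames that share the anchor's event label.
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no positive contributes nothing
        # Numerically stable log-sum-exp over all non-anchor frames.
        top = max(sims.values())
        log_denom = top + math.log(sum(math.exp(s - top) for s in sims.values()))
        # Average negative log-probability of the positives for this anchor.
        per_anchor.append(
            -sum(sims[j] - log_denom for j in positives) / len(positives)
        )
    return sum(per_anchor) / len(per_anchor) if per_anchor else 0.0


# Toy example: four frames whose embeddings form two clusters.
frames = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
coherent = temporal_contrastive_loss(frames, [0, 0, 1, 1])
incoherent = temporal_contrastive_loss(frames, [0, 1, 0, 1])
print(coherent < incoherent)
```

In this toy example the loss is lower when the labels agree with the embedding clusters than when they cut across them. A term of this kind would be added to the detection loss during training only, which is consistent with the abstract's statement that the approach adds a small training overhead and no extra cost to the trained model.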

Tags: contrastive learning, audio event detection, temporal coherence, DCASE challenge

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
