Explainable Audio Classification of Playing Techniques with Layer-wise Relevance Propagation

Changhong Wang (LS2N); Vincent Lostanlen (Cornell Lab of Ornithology); Mathieu Lagrange (LS2N)

07 Jun 2023

Deep convolutional networks (convnets) in the time–frequency domain can learn an accurate and fine-grained categorization of sounds. For example, in the context of music signal analysis, this categorization may correspond to a taxonomy of playing techniques: vibrato, tremolo, trill, and so forth. However, convnets lack an explicit connection with the neurophysiological underpinnings of musical timbre perception. In this article, we propose a data-driven approach to explain audio classification in terms of the physical attributes of sound production. We borrow from the current literature on “explainable AI” (XAI) to study the predictions of a convnet that achieves an almost perfect score on a challenging task: the classification of five comparable real-world playing techniques from 30 instruments spanning seven octaves. Mapping the signal into the carrier-modulation domain using the scattering transform, we decompose the network's predictions over this domain with layer-wise relevance propagation. We find that the regions highly relevant to the predictions are localized around the physical attributes with which the playing techniques are performed.
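To make the idea of layer-wise relevance propagation concrete, the following is a minimal sketch of the epsilon rule on a toy fully connected ReLU network. It is not the authors' implementation: the paper applies the method to a convnet over scattering coefficients, whereas this example uses dense layers, random weights, and zero biases as stated assumptions. The names `lrp_epsilon`, `weights`, `biases`, and `target` are hypothetical and chosen only for illustration.

```python
import numpy as np

def lrp_epsilon(weights, biases, x, target, eps=1e-9):
    """Redistribute the score of class `target` onto the input features.

    weights: list of arrays of shape (n_in, n_out), one per layer (assumed layout).
    biases:  list of arrays of shape (n_out,), one per layer.
    x:       input vector of shape (n_in,) for the first layer.
    """
    # Forward pass, keeping the input activation of every layer.
    activations = [x]
    last = len(weights) - 1
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = activations[-1] @ W + b
        # ReLU on hidden layers, linear output layer.
        activations.append(z if i == last else np.maximum(z, 0.0))

    # Initialize relevance with the output score of the target class only.
    relevance = np.zeros_like(activations[-1])
    relevance[target] = activations[-1][target]

    # Backward pass: R_i = a_i * sum_j w_ij * R_j / (z_j + eps * sign(z_j)).
    for W, b, a in zip(reversed(weights), reversed(biases),
                       reversed(activations[:-1])):
        z = a @ W + b
        z = z + eps * np.where(z >= 0, 1.0, -1.0)  # stabilizer, avoids division by zero
        s = relevance / z
        relevance = a * (W @ s)
    return relevance

# Toy usage with random weights and zero biases (illustrative values only).
rng = np.random.default_rng(0)
weights = [rng.normal(size=(6, 4)), rng.normal(size=(4, 3))]
biases = [np.zeros(4), np.zeros(3)]
x = rng.random(6)
target = 1

hidden = np.maximum(x @ weights[0] + biases[0], 0.0)
score = (hidden @ weights[1] + biases[1])[target]
relevance = lrp_epsilon(weights, biases, x, target)
print(relevance, relevance.sum(), score)
```

With zero biases and a small `eps`, the relevance values sum approximately to the target-class score, which is the conservation property that lets relevance be read as a decomposition of the prediction over the input domain.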

Tags:

Music information retrieval and music language processing

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
