TIME-FREQUENCY AND GEOMETRIC ANALYSIS OF TASK-DEPENDENT LEARNING IN RAW WAVEFORM BASED ACOUSTIC MODELS

Devansh Gupta, Vinayak Abrol

Length: 00:07:19

12 May 2022

End-to-end raw-waveform modelling with learnable feature-extraction front-ends has shown promising results in various speech and audio tasks. Despite this success, there have been few attempts to understand how spectral and temporal feature integration from raw inputs helps recognize task-dependent information. To this end, this work presents data-dependent and data-independent methods for understanding the modelling behavior of acoustic models. The first method employs time-frequency analysis to visualize input-specific response spectra as a function of short-time front-end block processing. The second method employs geometric properties of layer-wise weights to quantify the impact of architectural choices on signal propagation and trainability of the model. We demonstrate the potential of the proposed methods with the help of case studies on speech classification, speaker identification, and spoofing classification tasks.
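The abstract does not say which geometric properties of the layer-wise weights are used, but the "mutual coherence" tag suggests one candidate. As a rough illustration only, the sketch below computes the textbook mutual coherence of a weight array: the largest absolute cosine similarity between any two distinct filters. The function name `mutual_coherence`, the convention that filters lie along the first axis, and the toy matrices are assumptions made for this example, not the authors' procedure.

```python
import numpy as np


def mutual_coherence(weights: np.ndarray) -> float:
    """Largest absolute cosine similarity between two distinct filters.

    Assumption: each filter (or neuron) is one slice along the first axis,
    e.g. a Conv1d weight of shape (out_channels, in_channels, kernel_size).
    """
    w = np.asarray(weights, dtype=float)
    w = w.reshape(w.shape[0], -1)             # flatten each filter to a vector
    norms = np.linalg.norm(w, axis=1, keepdims=True)
    w = w / np.maximum(norms, 1e-12)          # unit-normalize, guarding zero filters
    gram = np.abs(w @ w.T)                    # pairwise |cosine similarity|
    np.fill_diagonal(gram, 0.0)               # ignore each filter's match with itself
    return float(gram.max())


# Hypothetical toy weights: orthogonal filters versus a set with a duplicate.
orthogonal = np.eye(4)
redundant = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

print(mutual_coherence(orthogonal))  # 0.0: no two filters overlap
print(mutual_coherence(redundant))   # 1.0: two filters are identical
```

Values near 0 indicate nearly orthogonal filters, while values near 1 indicate redundant ones. Because the quantity depends only on the weights, it is data-independent in the sense the abstract uses.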

Tags:

spectral visualization

raw-waveform models

acoustic modelling

mutual coherence

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
