Domain Adaptation for Speaker Recognition in Singing and Spoken Voice

Anurag Chowdhury, Austin Cozzo, Arun Ross

Length: 00:09:55

10 May 2022

In this work, we study how variability in speaking style and audio conditions between spoken and singing voice affects speaker recognition performance. We also explore the utility of domain adaptation for bridging the gap between different speaking styles (singing versus spoken) and improving overall speaker recognition performance. To that end, we first extend a publicly available singing voice dataset, JukeBox, with corresponding spoken voice data; we refer to the extended dataset as JukeBox-V2. Next, we use domain adaptation to develop a speaker recognition method that is robust to varying speaking styles and audio conditions. Finally, we analyze the speech embeddings of the domain-adapted models to explain their generalizability across speaking styles and audio conditions.

Tags: domain adaptation, singing voice, speaker recognition, speaking style, deep learning

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
