Text-Independent Speaker Verification With Adversarial Learning On Short Utterances

Kai Liu, Huan Zhou

SPS

Length: 11:16

04 May 2020

A text-independent speaker verification system suffers severe performance degradation under short-utterance conditions. To address this problem, we propose an adversarially learned embedding mapping model that directly maps a short-utterance embedding to an enhanced embedding with greater discriminability. In particular, a Wasserstein GAN and various alternative loss functions are proposed. These loss functions have distinct optimization objectives, and some of them are uncommon in speaker verification research. Unlike most prior studies, our main objective is to investigate the effectiveness of those loss functions through numerous ablation studies. Experiments on the VoxCeleb dataset verify that some of the loss functions are beneficial. Additionally, some compelling findings on the uncommon loss functions confirm the potential of our study. Lastly, our proposed system, even without any fine-tuning, improves on the baseline, with relative improvements of 4% in EER and 7% in minDCF for the challenging 2sec-2sec speaker verification condition.
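
For readers unfamiliar with the Wasserstein GAN objective mentioned in the abstract, the following is a minimal sketch of the standard critic and generator losses, not the authors' implementation. It assumes a critic that outputs one scalar score per embedding. The function names and the example scores are hypothetical, and the additional loss functions studied in the paper are not shown.

```python
from statistics import fmean


def critic_loss(real_scores, fake_scores):
    """Standard Wasserstein critic loss.

    Minimizing this value raises the critic's scores for real
    (assumed: long-utterance) embeddings and lowers them for
    generated (assumed: enhanced short-utterance) embeddings.
    """
    return fmean(fake_scores) - fmean(real_scores)


def generator_loss(fake_scores):
    """Standard Wasserstein generator loss.

    Minimizing this value pushes the mapping model to produce
    embeddings that the critic scores more highly.
    """
    return -fmean(fake_scores)


# Hypothetical critic outputs, for illustration only.
real_scores = [0.9, 1.1, 1.0]
fake_scores = [0.2, 0.4, 0.3]

d_loss = critic_loss(real_scores, fake_scores)
g_loss = generator_loss(fake_scores)
```

In a full training loop the critic would also need a Lipschitz constraint, commonly weight clipping or a gradient penalty; which one the paper uses is not stated in the abstract.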

Tags:

sps conference

icassp 2020 virtual conference

May 2020

icassp 2020