Text Adaptation For Speaker Verification With Speaker-Text Factorized Embeddings

Yexin Yang, Shuai Wang, Xun Gong, Yanmin Qian, Kai Yu

DOI

SPS

Members: Free
IEEE Members: $11.00
Non-members: $15.00

Length: 13:12

04 May 2020

Text mismatch between pre-collected data, either training data or enrollment data, and the actual test data can significantly hurt text-dependent speaker verification (SV) system performance. Although this problem can be solved by carefully collecting data with the target speech content, such data collection could be costly and inflexible. In this paper, we propose a novel text adaptation framework to address the text mismatch issue. Here, a speaker-text factorization network is proposed to factorize the input speech into speaker embeddings and text embeddings and then integrate them into a single representation in the later stage. Given a small amount of speaker-independent adaptation utterances, text embeddings of target speech content can be extracted and used to adapt the text-independent speaker embeddings to text-customized speaker embeddings. Experiments on RSR2015 show that text adaptation can significantly improve the performance of text mismatch conditions.

Tags:

sps conference

icassp 2020 virtual conference

May 2020

icassp 2020

Text Adaptation For Speaker Verification With Speaker-Text Factorized Embeddings

Yexin Yang, Shuai Wang, Xun Gong, Yanmin Qian, Kai Yu

Value-Added Bundle(s) Including this Product

ICASSP 2020 Virtual Conference - Presentation Videos Product Bundle

More Like This

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle

IEEE ICASSP 2024, 1 4-19 April 2024, Seoul, Korea. Conference Presentation Videos Bundle

ICIP 2022, October 16-19, 2022, Bordeaux, France - Presentation Videos Product Bundle

Join the IEEE Signal Processing Society