Frame-Level Phoneme-Invariant Speaker Embedding For Text-Independent Speaker Recognition On Extremely Short Utterances

Naohiro Tawara, Atsunori Ogawa, Tomoharu Iwata, Tetsuji Ogawa, Marc Delcroix


SPS


Length: 14:08

04 May 2020

This paper investigates a phoneme-invariant speaker embedding approach for speaker recognition on extremely short utterances. Intuitively, phonemes are nuisance information for text-independent speaker recognition, since the spoken content usually differs between enrollment and test time. However, many studies have shown that incorporating phoneme information is quite effective at improving speaker recognition performance. One reasonable explanation for this counter-intuitive result is that the pooling mechanism of segment-based speaker embedding can focus on the specific phonemes that carry rich speaker information, and phoneme information may help it do so. From this insight, we hypothesize that the pooling mechanism and phoneme-aware training are harmful when extracting speaker embeddings from extremely short utterances. To verify this hypothesis, we introduce an adversarial framework that removes phoneme variability from the frame-wise speaker embeddings. Experimental results on the LibriSpeech corpus confirm that our frame-wise, phoneme-adversarial approach outperforms the conventional segment-wise, phoneme-aware approach for short utterances of less than about 1.4 seconds.
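The abstract does not spell out the adversarial objective, so the following is only a minimal sketch of one common way such a framework can be set up, not the authors' formulation. It assumes a per-frame speaker classifier and a per-frame phoneme classifier on top of the embedding, and a hypothetical weight `adv_weight`; all function and parameter names here are illustrative. The phoneme classifier would minimize its own cross-entropy, while the embedding extractor minimizes the speaker loss minus a weighted phoneme loss, which pushes the frame-wise embedding to keep speaker information and discard phoneme information.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of class `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def encoder_objective(speaker_logits, speaker_id,
                      phoneme_logits, phoneme_id, adv_weight=0.5):
    """Per-frame loss for the embedding extractor (assumed form).

    The speaker term rewards embeddings that identify the speaker.
    The phoneme term enters with a negative sign, so the extractor is
    rewarded when the phoneme classifier does badly on its embeddings.
    `adv_weight` is a hypothetical trade-off hyperparameter.
    """
    speaker_loss = cross_entropy(speaker_logits, speaker_id)
    phoneme_loss = cross_entropy(phoneme_logits, phoneme_id)
    return speaker_loss - adv_weight * phoneme_loss


# One frame: the speaker classifier favors the correct speaker (index 0).
# Case A: the phoneme classifier is at chance on this embedding.
loss_invariant = encoder_objective([2.0, 0.0], 0, [0.0, 0.0], 1)
# Case B: the phoneme classifier confidently recovers the phoneme.
loss_leaky = encoder_objective([2.0, 0.0], 0, [0.0, 5.0], 1)
```

Under this assumed objective, `loss_invariant` is lower than `loss_leaky`: an embedding from which the phoneme cannot be predicted is preferred by the extractor. In practice the same effect is often obtained with a gradient reversal layer between the embedding and the phoneme classifier; whether the paper uses that mechanism is not stated in this abstract.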

Tags:

sps conference

icassp 2020 virtual conference

May 2020

icassp 2020

