SELF-SUPERVISED SPEAKER RECOGNITION WITH LOSS-GATED LEARNING

Ruijie Tao, Ville Hautamäki, Haizhou Li, Kong Aik Lee, Rohan Kumar Das

Length: 00:10:51

08 May 2022

In self-supervised learning for speaker recognition, pseudo labels serve as the supervision signal. Because pseudo labels are unreliable, a speaker recognition model does not always benefit from them. In this work, we observe that a speaker recognition network tends to fit data with reliable labels faster than data with unreliable labels. This motivates us to study a loss-gated learning (LGL) strategy, which selects the reliable labels during training based on how well the neural network fits them. With the proposed LGL, our speaker recognition model obtains a 46.3% performance gain over the system without it. Further, the proposed self-supervised speaker recognition system with LGL, trained on the VoxCeleb2 dataset without any labels, achieves an equal error rate of 1.66% on the VoxCeleb1 original test set.
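The loss-gating idea can be illustrated with a minimal sketch. The abstract does not give the exact gating rule, so this assumes a fixed per-sample loss threshold: samples whose cross-entropy against their pseudo label falls below the threshold are treated as reliable and kept for the update. The function names (`cross_entropy`, `loss_gate`, `gated_mean_loss`), the threshold value, and the toy logits are all hypothetical and chosen only for illustration.

```python
import math


def cross_entropy(logits, label):
    """Per-sample cross-entropy from raw logits (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def loss_gate(batch_logits, pseudo_labels, threshold):
    """Return indices of samples whose loss is below the threshold,
    together with the full list of per-sample losses.

    Assumption: a fixed scalar threshold acts as the gate.
    """
    losses = [cross_entropy(l, y) for l, y in zip(batch_logits, pseudo_labels)]
    kept = [i for i, loss in enumerate(losses) if loss < threshold]
    return kept, losses


def gated_mean_loss(losses, kept):
    """Average the loss over the kept samples only (0.0 if none pass)."""
    if not kept:
        return 0.0
    return sum(losses[i] for i in kept) / len(kept)


# Toy batch: samples 0 and 2 are fitted well by the network,
# sample 1 is not, which suggests its pseudo label may be unreliable.
batch_logits = [[4.0, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 5.0, 0.0]]
pseudo_labels = [0, 2, 1]

kept, losses = loss_gate(batch_logits, pseudo_labels, threshold=1.0)
training_loss = gated_mean_loss(losses, kept)
```

In this toy batch only the two well-fitted samples pass the gate, so the poorly fitted one contributes nothing to the training loss. In a real training loop the same mask would be applied to per-sample losses computed by the network before backpropagation.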

Tags:

loss-gated learning

pseudo label selection

self-supervised speaker recognition

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
