Toroidal Probabilistic Spherical Discriminant Analysis

Anna Silnova (Brno University of Technology); Niko Brummer (Amazon); Albert DP Swart (Speechly); Lukáš Burget (Brno University of Technology)

07 Jun 2023

In speaker recognition, where speech segments are mapped to embeddings on the unit hypersphere, two scoring back-ends are commonly used: cosine scoring and PLDA. We recently proposed PSDA, an analogue of PLDA that uses von Mises-Fisher distributions instead of Gaussians. In this paper, we present toroidal PSDA (T-PSDA), which extends PSDA with the ability to model within-speaker and between-speaker variabilities in toroidal submanifolds of the hypersphere. Like PLDA and PSDA, the model allows closed-form scoring and closed-form EM updates for training. On VoxCeleb, we find T-PSDA accuracy on par with cosine scoring, while PLDA accuracy is inferior. On NIST SRE'21, we find that T-PSDA gives large accuracy gains compared to both cosine scoring and PLDA.
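For readers unfamiliar with the cosine-scoring baseline mentioned above, the following is a minimal sketch of how two embeddings might be compared once projected onto the unit hypersphere. It is not the authors' implementation and does not implement PSDA or T-PSDA; the function names `length_normalize` and `cosine_score`, the embedding dimension of 256, and the synthetic random embeddings are all assumptions made for illustration.

```python
import numpy as np


def length_normalize(x):
    """Project an embedding (or a batch of embeddings) onto the unit hypersphere."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def cosine_score(enroll, test):
    """Cosine similarity between one enrollment and one test embedding.

    Hypothetical helper: after length normalization the score is simply
    the dot product of the two unit vectors, so it lies in [-1, 1].
    """
    e = length_normalize(enroll)
    t = length_normalize(test)
    return float(np.dot(e, t))


# Synthetic stand-ins for speaker embeddings (assumed dimension: 256).
rng = np.random.default_rng(0)
enroll = rng.normal(size=256)
test_same = enroll + 0.1 * rng.normal(size=256)  # small perturbation: "same speaker"
test_other = rng.normal(size=256)                # independent draw: "different speaker"

score_same = cosine_score(enroll, test_same)
score_other = cosine_score(enroll, test_other)
```

In this toy setup the perturbed copy scores much closer to 1 than the independent draw does. A verification decision would then compare the score against a threshold tuned on held-out data.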

Tags:

Speaker verification and anti-spoofing

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
