Spoken language recognition with cluster-based modeling

Stanisław Kacprzak, Magdalena Rybicka, Konrad Kowalczyk

Length: 00:07:49

09 May 2022

In this study, we analyze the incorporation of cluster-based modeling into language recognition systems in which a single utterance is represented as an embedding, using the widely adopted i-vectors and x-vectors. We compare the results obtained with Cosine Distance Scoring, a Gaussian Mixture Model, Logistic Regression, and a Mixture of von Mises-Fisher distributions against those of classifiers based on the proposed approach, which incorporates cluster-based sub-models. Experimental evaluation is performed on the i-vector embeddings from the NIST 2015 Language Recognition i-Vector Machine Learning Challenge and the x-vector embeddings from the Oriental Language Recognition 2020 Challenge (AP20-OLR). The experimental results show that the proposed approach, combined with a discriminatively trained Logistic Regression classifier, achieves notable improvements over the baseline systems, i.e., those without language sub-models, and that our approach is competitive with other systems reported in the literature.
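The abstract does not spell out how the cluster-based sub-models are built, so the following is only a minimal sketch of the general idea under stated assumptions: each language's embeddings are split into k clusters with a simple spherical k-means, each cluster centroid acts as a sub-model, and an utterance is assigned to the language whose best-matching centroid has the highest cosine similarity. The function names (spherical_kmeans, train_submodels, recognize), the choice of k-means, the max-over-clusters scoring rule, and the 3-dimensional toy embeddings are all illustrative assumptions, not the authors' method or data.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length (cosine scoring ignores magnitude)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


def spherical_kmeans(vectors, k, iters=20, seed=0):
    """Cluster unit vectors by cosine similarity; returns k unit centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            best = max(range(k), key=lambda j: cosine(v, centroids[j]))
            groups[best].append(v)
        for j, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster went empty
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[j] = normalize(mean)
    return centroids


def train_submodels(embeddings_by_language, k):
    """Build k cluster centroids (assumed sub-models) for every language."""
    return {
        lang: spherical_kmeans([normalize(v) for v in vecs], k)
        for lang, vecs in embeddings_by_language.items()
    }


def recognize(embedding, submodels):
    """Pick the language whose best-matching sub-model is closest."""
    e = normalize(embedding)
    scores = {
        lang: max(cosine(e, c) for c in centroids)
        for lang, centroids in submodels.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy embeddings: each language has two distinct clusters.
toy = {
    "lang_a": [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1], [0.0, 1.0, 0.1], [0.1, 0.9, 0.0]],
    "lang_b": [[0.0, 0.1, 1.0], [0.1, 0.0, 0.9], [-1.0, 0.0, 0.1], [-0.9, 0.1, 0.0]],
}
submodels = train_submodels(toy, k=2)
predicted, scores = recognize([0.05, 0.95, 0.0], submodels)
print(predicted, scores)
```

In this sketch the sub-models feed a cosine scorer for brevity. The abstract reports the largest gains when sub-models are combined with a discriminatively trained Logistic Regression classifier, which under the same assumptions would amount to training on cluster-level labels and then combining the sub-model scores per language.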

Tags: x-vectors, language recognition, AP20-OLR challenge, clustering, i-vectors

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
