Exploiting speaker embeddings for improved microphone clustering and speech separation in ad-hoc microphone arrays

Stijn Kindt (UGent); Jenthe Thienpondt (IDLab, Ghent University); Nilesh Madhu (IDLab, Ghent University - imec)

07 Jun 2023

To separate sources captured by ad-hoc distributed microphones, a key first step is to assign the microphones to the appropriate source-dominated clusters. The features used for such (blind) clustering are based on a fixed-length embedding of the audio signals in a high-dimensional latent space. In previous work, the embedding was hand-engineered from the Mel-frequency cepstral coefficients and their modulation spectra. This paper argues that embedding frameworks designed explicitly to discriminate reliably between speakers would produce more appropriate features. We propose using features generated by the state-of-the-art ECAPA-TDNN speaker verification model for the clustering. We benchmark these features on the subsequent signal enhancement as well as on the quality of the clustering, for which we further introduce two intuitive metrics. Results indicate that, in contrast to the hand-engineered features, the ECAPA-TDNN-based features lead to more logical clusters and better performance in the subsequent enhancement stages, thus validating our hypothesis.
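
The clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm the paper uses, so spherical k-means (k-means on cosine similarity) stands in here as an assumption. The function name `cluster_microphones` is hypothetical, and the 4-dimensional vectors are hand-written toy values rather than real ECAPA-TDNN outputs.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity of two unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cluster_microphones(embeddings, n_clusters, n_iter=20):
    """Assign each per-microphone embedding to a cluster (spherical k-means).

    Assumption: this is an illustrative stand-in, not the paper's method.
    """
    points = [normalize(e) for e in embeddings]
    # Deterministic farthest-point initialisation of the centroids.
    centroids = [points[0]]
    while len(centroids) < n_clusters:
        farthest = min(points, key=lambda p: max(cosine(p, c) for c in centroids))
        centroids.append(farthest)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: pick the centroid with the highest cosine similarity.
        labels = [max(range(n_clusters), key=lambda k: cosine(p, centroids[k]))
                  for p in points]
        # Update step: renormalised mean of each cluster's members.
        for k in range(n_clusters):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[k] = normalize(mean)
    return labels


# Toy embeddings (assumed values): microphones 0-2 are dominated by one
# speaker, microphones 3-5 by another.
embeddings = [
    [0.9, 0.1, 0.0, 0.1],
    [1.0, 0.0, 0.1, 0.0],
    [0.8, 0.2, 0.1, 0.0],
    [0.1, 0.9, 0.0, 0.1],
    [0.0, 1.0, 0.1, 0.1],
    [0.2, 0.8, 0.0, 0.0],
]
labels = cluster_microphones(embeddings, n_clusters=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In a real pipeline, each row would instead be the fixed-length embedding computed from one microphone's signal, and the resulting clusters would feed the subsequent enhancement stage.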

Tags:

Audio for multimedia and audio processing systems

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
