IMPROVING SEPARATION-BASED SPEAKER DIARIZATION VIA ITERATIVE MODEL REFINEMENT AND SPEAKER EMBEDDING BASED POST-PROCESSING

Shu-Tong Niu, Jun Du, Lei Sun, Chin-Hui Lee

Length: 00:14:56

13 May 2022

In this paper, we propose an iterative separation-based speaker diarization (ISSD) approach to cope with realistic data conditions. In the proposed ISSD, we iteratively generate adaptation data according to speaker priors and fine-tune the separation model, which leads to a gradual performance improvement. To further reduce speaker detection errors that plain ISSD cannot avoid because of errors in the speaker priors, we use speaker embedding information and propose two post-processing techniques, namely speaker filtering and speaker recovery. We evaluate diarization performance on the two-speaker conversational telephone speech (CTS) data set from the DIHARD-III Challenge. Compared with a state-of-the-art clustering-based speaker diarization (CSD) system, the proposed ISSD approach combined with the two post-processing schemes yields relative diarization error rate reductions of 47.72% and 46.97% on the development and evaluation sets, respectively. ISSD is also a key contributing factor to the best-performing system in the DIHARD-III Challenge.
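For readers unfamiliar with the metric, the relative diarization error rate (DER) reduction quoted in the abstract is the drop in DER expressed as a fraction of the baseline DER. The short sketch below shows only that arithmetic. The function name `relative_reduction` and the two DER values are assumptions made up for illustration; the abstract does not state absolute DER figures, so none are reproduced here.

```python
def relative_reduction(baseline_der: float, proposed_der: float) -> float:
    """Return the relative DER reduction, in percent, over a baseline system."""
    if baseline_der <= 0:
        raise ValueError("baseline_der must be positive")
    return 100.0 * (baseline_der - proposed_der) / baseline_der


# Hypothetical DER values, chosen only to illustrate the calculation.
# They are NOT the figures reported by the authors.
baseline_der = 20.0  # percent, made-up baseline (for example, a CSD system)
proposed_der = 10.0  # percent, made-up proposed system

print(f"{relative_reduction(baseline_der, proposed_der):.2f}% relative reduction")
```

Under this definition, a relative reduction of roughly 47% means the proposed system's DER is a little more than half of the baseline's, whatever the absolute values are.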

Tags: iteration, speech separation, dihard-iii challenge, speaker diarization, post-processing
