SAGRNN: SELF-ATTENTIVE GATED RNN FOR BINAURAL SPEAKER SEPARATION WITH INTERAURAL CUE PRESERVATION

Ke Tan, Buye Xu, Anurag Kumar, Eliya Nachmani, Yossi Adi

Length: 00:16:37

08 May 2022

Most existing deep-learning-based binaural speaker separation systems focus on producing a monaural estimate for each of the target speakers, and thus do not preserve the interaural cues, which are crucial for human listeners to perform sound localization and lateralization. In this study, we address talker-independent binaural speaker separation with interaural cues preserved in the estimated binaural signals. Specifically, we extend a newly developed gated recurrent neural network for monaural separation by additionally incorporating self-attention mechanisms and dense connectivity. We develop an end-to-end multiple-input multiple-output system, which directly maps the binaural waveform of the mixture to the binaural waveforms of the individual speech signals. The experimental results show that our proposed approach achieves significantly better separation performance than a recent binaural separation approach. In addition, our approach effectively preserves the interaural cues, which improves the accuracy of sound localization.
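As an illustration of what "interaural cues" refers to here, the following is a minimal sketch of how the two standard cues, the interaural level difference (ILD) and the interaural time difference (ITD), might be estimated from a two-channel signal. It is not the authors' evaluation procedure. The function name `interaural_cues`, the broadband energy-ratio ILD, and the cross-correlation ITD estimator are all assumptions chosen for simplicity.

```python
import numpy as np


def interaural_cues(left, right, sample_rate):
    """Estimate broadband ILD (dB) and ITD (seconds) of a binaural signal.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    eps = 1e-12  # guards against log(0) and division by zero on silent input

    # ILD: left-to-right energy ratio in dB (positive means left is louder).
    ild_db = 10.0 * np.log10((np.sum(left ** 2) + eps) / (np.sum(right ** 2) + eps))

    # ITD: the lag that maximizes the cross-correlation of the two channels.
    # With mode="full", index 0 corresponds to a lag of -(len(right) - 1).
    xcorr = np.correlate(left, right, mode="full")
    lag = int(np.argmax(xcorr)) - (len(right) - 1)
    itd_seconds = lag / sample_rate  # positive means left lags right

    return ild_db, itd_seconds


if __name__ == "__main__":
    # Synthetic source that reaches the right ear first and is louder there.
    rng = np.random.default_rng(0)
    source = rng.standard_normal(4000)
    delay = 8  # samples
    right = source
    left = 0.5 * np.concatenate([np.zeros(delay), source[:-delay]])
    print(interaural_cues(left, right, sample_rate=16000))
```

Under this sign convention, a source on the listener's right yields a negative ILD and a positive ITD. A separation system preserves the cues when the values measured on each estimated binaural signal stay close to those measured on the corresponding clean binaural signal.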

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
