AN END-TO-END DEEP LEARNING FRAMEWORK FOR MULTIPLE AUDIO SOURCE SEPARATION AND LOCALIZATION

Yu Chen, Bowen Liu, Zijian Zhang, Hun-Seok Kim

Length: 00:09:21

11 May 2022

Sound source separation and localization for situational awareness enable a wide range of applications, such as hearing enhancement and audio beamforming. We present an end-to-end deep learning framework that separates and localizes multiple audio sources from a multi-channel mixture. The proposed framework jointly estimates the separated sources and their time differences of arrival (TDOA) at different microphones, and then derives the direction of arrival (DOA) of each source. A new structure that reconstructs the mixed signal is introduced for joint optimization of source separation and TDOA estimation. In addition, a discriminator network is added during the training phase to further improve separation quality. Experimental results demonstrate that the proposed method achieves state-of-the-art accuracy on both source separation and DOA estimation.
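
The abstract does not say how the DOA is obtained from the estimated TDOA. As an illustration only, the sketch below uses the textbook far-field relation for a single two-microphone pair, sin(theta) = c * tau / d. The function name `tdoa_to_doa`, the 343 m/s speed of sound, and the microphone spacing are assumptions made for this example and are not taken from the paper, which may use a different array geometry or mapping.

```python
import math

# Assumed speed of sound in air (m/s); not specified in the abstract.
SPEED_OF_SOUND = 343.0


def tdoa_to_doa(tdoa_s: float, mic_spacing_m: float,
                c: float = SPEED_OF_SOUND) -> float:
    """Convert a TDOA (seconds) between two microphones into a DOA (degrees).

    Hypothetical helper: assumes a far-field (plane-wave) source and a single
    microphone pair. The angle is measured from the broadside direction, so
    0 degrees means the source is equidistant from both microphones.
    """
    if mic_spacing_m <= 0:
        raise ValueError("mic_spacing_m must be positive")
    # Far-field geometry: path-length difference = c * tau = d * sin(theta).
    ratio = c * tdoa_s / mic_spacing_m
    # Clamp so that small estimation errors cannot push asin out of its domain.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    spacing = 0.1  # metres, assumed for the example
    for tau in (0.0, 0.05 / SPEED_OF_SOUND, -0.05 / SPEED_OF_SOUND):
        print(f"tau = {tau:+.6e} s -> DOA = {tdoa_to_doa(tau, spacing):+.2f} deg")
```

With a single pair the angle is only resolved up to a front/back ambiguity, which is one reason practical systems combine TDOAs from several microphone pairs.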

Tags: discriminator, audio source separation, deep learning, multiple audio source localization

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
