3D audio signal processing systems for speech enhancement and sound localization and detection

Jisheng Bai (School of Marine Science and Technology, Northwestern Polytechnical University); Siwei Huang (JLESS); Han Yin (JLESS); Mou Wang (Northwestern Polytechnical University); Yafei Jia (School of Marine Science and Technology, Northwestern Polytechnical University); Jianfeng Chen (School of Marine Science and Technology, Northwestern Polytechnical University)

10 Jun 2023

The L3DAS23 ICASSP Signal Processing Grand Challenge encourages research on 3D audio signal processing, including 3D speech enhancement (SE) and 3D sound localization and detection (SELD). In this paper, we propose a two-stage system based on DPRNN and UNet for the SE task and a Conformer-based system for the SELD task. The proposed SE and SELD systems are evaluated on the L3DAS23 blind test sets. Results show that the proposed methods achieve state-of-the-art performance for 3D SE and SELD.

Tags:

Signal Processing for Communications and Networking

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
