EXPLORE RELATIVE AND CONTEXT INFORMATION WITH TRANSFORMER FOR JOINT ACOUSTIC ECHO CANCELLATION AND SPEECH ENHANCEMENT
Xingwei Sun, Chenbin Cao, Qinglong Li, Linzhang Wang, Fei Xiang
This paper proposes a joint acoustic echo cancellation (AEC) and speech enhancement method that combines an adaptive filter with a deep neural network (DNN) model. A partitioned-block adaptive filter performs linear AEC, followed by a convolutional neural network and transformer based model that suppresses the residual echo, noise, and reverberation. The DNN model consists of three modules: an encoder, a dual-path transformer (DPT), and a decoder. The encoder explores the potential relationships between the far-end and near-end signals with the attention mechanism of the transformer. The DPT module further explores context information in both the time and frequency dimensions. An attention mask is applied in the transformer to enable real-time processing. Finally, the decoder estimates a complex spectral mask to recover the target speech. The proposed DNN model is trained on the ICASSP 2022 AEC Challenge datasets and ranked fourth in the challenge, with satisfactory performance on both the subjective and word accuracy rate evaluations.
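The abstract notes that an attention mask in the transformer is what allows real-time processing, and that the DPT attends over both the time and frequency dimensions. The sketch below is a minimal illustration of how such a dual-path block with a causal time-axis mask might look in PyTorch; the module name `DualPathTransformerBlock`, the layer sizes, and the use of the off-the-shelf `nn.TransformerEncoderLayer` are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (assumed, not the paper's code) of a dual-path transformer
# block: full attention across frequency bins within each frame, and
# causally masked attention across time frames so each frame only sees
# the past, which is what permits frame-by-frame (real-time) inference.
import torch
import torch.nn as nn


def causal_mask(n_frames: int) -> torch.Tensor:
    # Boolean mask where True entries are blocked: frame t attends only to frames <= t.
    return torch.triu(torch.ones(n_frames, n_frames), diagonal=1).bool()


class DualPathTransformerBlock(nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        # Frequency path: non-causal attention over bins within a frame.
        self.freq_attn = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        # Time path: attention over frames for each bin, causally masked.
        self.time_attn = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, freq, channels)
        b, t, f, c = x.shape

        # Attend over frequency bins, treating every frame independently.
        xf = self.freq_attn(x.reshape(b * t, f, c)).reshape(b, t, f, c)

        # Attend over past time frames, treating every frequency bin independently.
        xt = xf.permute(0, 2, 1, 3).reshape(b * f, t, c)
        xt = self.time_attn(xt, src_mask=causal_mask(t).to(x.device))
        return xt.reshape(b, f, t, c).permute(0, 2, 1, 3)


if __name__ == "__main__":
    block = DualPathTransformerBlock()
    feats = torch.randn(2, 100, 32, 64)  # (batch, frames, freq bins, channels)
    print(block(feats).shape)            # torch.Size([2, 100, 32, 64])
```

In a full system along the lines described above, a stack of such blocks would sit between the encoder (fed with far-end and near-end features) and the decoder that outputs the complex spectral mask.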