NEURAL CASCADE ARCHITECTURE FOR JOINT ACOUSTIC ECHO AND NOISE SUPPRESSION
Hao Zhang, DeLiang Wang
SPS
Length: 00:11:40
In this paper, we propose a neural cascade architecture for joint acoustic echo and noise suppression. The architecture consists of two modules. The first module employs a convolutional recurrent network (CRN) for complex spectral mapping. Its output is fed as an additional input to the second module, where a long short-term memory (LSTM) network estimates a magnitude mask. The entire architecture is trained end-to-end, with the two modules optimized jointly using a single loss function. The final output combines the enhanced phase from the first module with the enhanced magnitude from the second. This cascade design lets the proposed method obtain robust magnitude estimation as well as phase enhancement. Evaluation results show that the proposed method effectively suppresses acoustic echo and noise while preserving good speech quality, and significantly outperforms related methods.
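The final combination step described in the abstract can be illustrated with a minimal sketch. It covers only how the two module outputs might be merged per time-frequency bin, not the CRN or LSTM themselves. The function name `combine_cascade_outputs` and its arguments are hypothetical, and applying the mask to the microphone-signal magnitude is an assumption; the abstract does not say which magnitude the mask scales.

```python
import cmath


def combine_cascade_outputs(mic_spec, stage1_spec, stage2_mask):
    """Merge the two module outputs bin by bin (illustrative sketch only).

    mic_spec    : complex spectrogram of the microphone signal, [frames][bins]
    stage1_spec : complex spectrogram estimated by the first module (CRN)
    stage2_mask : real-valued magnitude mask estimated by the second module (LSTM)

    Assumption: the mask scales the microphone magnitude. The phase is taken
    from the first module's complex estimate, as the abstract describes.
    """
    output = []
    for mic_frame, s1_frame, mask_frame in zip(mic_spec, stage1_spec, stage2_mask):
        frame = []
        for y, s1, m in zip(mic_frame, s1_frame, mask_frame):
            magnitude = m * abs(y)      # enhanced magnitude (second module)
            phase = cmath.phase(s1)     # enhanced phase (first module)
            frame.append(cmath.rect(magnitude, phase))
        output.append(frame)
    return output
```

Keeping the phase and magnitude paths separate at this step mirrors the stated motivation: the mask-based branch targets robust magnitude estimation, while the complex-mapping branch supplies the phase.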