THE PCG-AIID SYSTEM FOR L3DAS22 CHALLENGE: MIMO AND MISO CONVOLUTIONAL RECURRENT NETWORK FOR MULTI CHANNEL SPEECH ENHANCEMENT AND SPEECH RECOGNITION
Jingdong Li, Yuanyuan Zhu, Dawei Luo, Yun Liu, Guohui Cui, Zhaoxia Li
SPS
Length: 00:06:19
This paper describes the PCG-AIID system for Task 1 of the L3DAS22 challenge: 3D speech enhancement in a reverberant office environment. We propose a two-stage framework for multi-channel speech denoising and dereverberation. In the first stage, a multiple-input multiple-output (MIMO) network removes background noise while preserving the spatial characteristics of the multi-channel signals. In the second stage, a multiple-input single-output (MISO) network enhances the speech arriving from the desired direction and performs post-filtering. Our system ranked 3rd in the ICASSP 2022 L3DAS22 challenge, significantly outperforming the baseline system and achieving 3.2% WER and 0.972 STOI on the blind test set.
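The data flow of the two stages can be illustrated with a minimal Python sketch. It shows only the input/output contract described above: a MIMO stage that maps a multi-channel signal to a multi-channel signal of the same shape, followed by a MISO stage that maps it to a single channel. The function names `mimo_stage`, `miso_stage`, and `enhance` are hypothetical, and the stage bodies (per-channel mean removal and channel averaging) are trivial stand-ins, not the convolutional recurrent networks of the paper. The channel count and sample rate are likewise assumed for illustration.

```python
import numpy as np


def mimo_stage(x: np.ndarray) -> np.ndarray:
    """Stand-in for stage 1: (channels, samples) -> (channels, samples).

    Placeholder operation: remove each channel's mean. The real system
    uses a trained network here; this only mimics the shape contract.
    """
    return x - x.mean(axis=1, keepdims=True)


def miso_stage(x: np.ndarray) -> np.ndarray:
    """Stand-in for stage 2: (channels, samples) -> (samples,).

    Placeholder operation: average across channels. The real system
    uses a trained network that also performs post-filtering.
    """
    return x.mean(axis=0)


def enhance(x: np.ndarray) -> np.ndarray:
    """Chain the two stages: multi-channel mixture in, one channel out."""
    return miso_stage(mimo_stage(x))


# Assumed example input: 4 channels, 1 second at an assumed 16 kHz rate.
rng = np.random.default_rng(0)
mixture = rng.standard_normal((4, 16000))

denoised = mimo_stage(mixture)   # still multi-channel
enhanced = enhance(mixture)      # single channel

print(denoised.shape, enhanced.shape)
```

Keeping the first stage multi-channel is what lets the second stage still exploit the differences between channels when it picks out the desired direction.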