Dual-Path Dilated Convolutional Recurrent Network with Group Attention for Multi-Channel Speech Enhancement
Jiaming Cheng (Southeast University); Cong Pang (Southeast University); Ruiyu Liang (Southeast University); Jingjie Fan (Southeast University); Li Zhao (Southeast University)
This paper proposes a dual-path convolutional recurrent network with group attention for Task 1 (3D speech enhancement) of the L3DAS23 challenge. The architecture is built on a convolutional encoder-decoder, with frequency-time blocks based on group attention inserted in the middle. The encoder extracts local representations from the complex spectrum, groups of time-frequency processing modules capture the correlations along the frequency and time axes, and the group attention extracts the key information in the feature flow. Our system ranked first in the 3D speech enhancement task of the ICASSP 2023 L3DAS23 challenge and significantly outperformed the baseline, achieving a WER of 0.101 and an STOI of 0.902 on the blind test set.
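For readers unfamiliar with the WER figure reported above, the following is a minimal sketch of how a word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis transcript, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions, not part of the challenge tooling, and the abstract does not describe how the transcripts themselves are obtained.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name), not the challenge's scorer.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))
```

Under this definition, a WER of 0.101 corresponds to roughly one word error per ten reference words; lower is better, whereas for STOI higher is better.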