EAD-CONFORMER: A CONFORMER-BASED ENCODER-ATTENTION-DECODER-NETWORK FOR MULTI-TASK AUDIO SOURCE SEPARATION
Chenxing Li, Yang Wang, Feng Deng, Zhuo Zhang, Xiaorui Wang, Zhongyuan Wang
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:11:56
In this paper, we propose a Conformer-based network to improve the performance of multi-task audio source separation. The network, named EAD-Conformer, employs Conformer blocks to capture both local and global information, and its encoder-attention-decoder structure encourages the network to model each source attentively. Specifically, EAD-Conformer first extracts feature representations from the mixture with a Conformer-based encoder. Then, an attention module extracts selective information for each track and bridges the encoder and the decoders. Finally, three decoders process the attentive features, one per source, and generate the output masks for the different sources. In addition, the proposed discriminative loss further enlarges the distance between different sources. Experiments demonstrate the effectiveness of EAD-Conformer, which achieves signal-to-distortion ratio improvements of 13.37 dB, 11.41 dB, and 10.56 dB on the speech, music, and noise tracks, respectively, and shows advantages over several well-known models.
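The reported figures are signal-to-distortion ratio improvements (SDRi), i.e. the SDR of a separated track minus the SDR of the unprocessed mixture, both measured against the same clean reference. The following is a minimal sketch of that arithmetic, assuming the plain energy-ratio definition of SDR; the evaluation toolkit and exact SDR variant used in the paper are not stated in the abstract, and the function names and toy signals below are hypothetical.

```python
import math


def sdr_db(reference, estimate, eps=1e-12):
    """Plain SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    Assumption: simple energy-ratio SDR, not necessarily the variant
    used in the paper's evaluation.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_power = sum(s * s for s in reference)
    distortion_power = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    # eps guards against division by zero and log of zero.
    return 10.0 * math.log10((signal_power + eps) / (distortion_power + eps))


def sdr_improvement_db(reference, estimate, mixture):
    """SDRi: SDR of the estimate minus SDR of the raw mixture."""
    return sdr_db(reference, estimate) - sdr_db(reference, mixture)


# Toy signals (hypothetical): a "speech" tone plus an interfering tone.
speech = [math.sin(0.1 * n) for n in range(1000)]
interference = [0.5 * math.sin(0.37 * n + 1.0) for n in range(1000)]
mixture = [s + i for s, i in zip(speech, interference)]

# Hypothetical separator output that leaves 10% of the interference amplitude.
estimate = [s + 0.1 * i for s, i in zip(speech, interference)]

sdri = sdr_improvement_db(speech, estimate, mixture)
print(f"SDRi: {sdri:.2f} dB")  # prints "SDRi: 20.00 dB"
```

In this toy case the residual interference amplitude drops by a factor of 10, so its energy drops by a factor of 100 and the improvement is 20 dB regardless of the signal levels. In the multi-task setting the same computation would be repeated per track, with the speech, music, and noise references each compared against their own estimate and the shared mixture.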