Deep Subband Network for Joint Suppression of Echo, Noise and Reverberation in Real-Time Fullband Speech Communication
Feifei Xiong (Alibaba Group); Minya Dong (Alibaba Group); Kechenying Zhou (Alibaba Group); Houwei Zhu (Alibaba Group); Jinwei Feng (Alibaba Group)
SPS
This paper presents a deep, lightweight subband neural network that jointly suppresses the three common types of interference in real-time fullband speech communication: echo, noise, and reverberation. The proposed framework preserves the advantages of the spectro-temporal subband network (STSubNet), which requires only a small amount of resources to generalize well within a lightweight model, and incorporates an adaptive filter and a modified time-domain loss function designed to balance suppression efficiency among the three types of interference. Extensive experimental results show that the proposed loss function significantly improves residual echo suppression in far-end single-talk scenarios and balances distortion of the desired signal against suppression of the undesired signal. In addition, we find that STSubNet performs better when the adaptive filter output (preferably from a better-converged filter) is its primary input. The joint model achieves performance competitive with state-of-the-art separate models on three public benchmark test sets drawn from the individual areas of echo suppression, denoising, and dereverberation.
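To make the role of the adaptive filter concrete, the following is a minimal sketch of a linear echo canceller whose error signal could serve as the primary input to a downstream suppression network. The abstract does not state which adaptive algorithm the paper uses; a time-domain normalized LMS (NLMS) filter is assumed here purely for illustration. The function name `nlms_echo_canceller`, the parameters `taps`, `mu`, and `eps`, and the 3-tap echo path are all hypothetical.

```python
import random


def nlms_echo_canceller(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Return e[n] = mic[n] - estimated echo[n] using NLMS (assumed, not the paper's filter)."""
    w = [0.0] * taps    # adaptive weights: running estimate of the echo path
    buf = [0.0] * taps  # most recent far-end samples, newest first
    out = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # linear echo estimate
        e = d - y                                   # residual after echo removal
        norm = sum(b * b for b in buf) + eps        # input power for normalization
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out


def power(sig):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in sig) / len(sig)


# Toy far-end single-talk case: the microphone picks up only the echo of the
# far-end signal through a hypothetical 3-tap echo path (no near-end speech or noise).
random.seed(0)
far = [random.uniform(-1.0, 1.0) for _ in range(2000)]
path = [0.6, -0.3, 0.1]
mic = [
    sum(path[k] * far[n - k] for k in range(len(path)) if n - k >= 0)
    for n in range(len(far))
]

residual = nlms_echo_canceller(far, mic)
echo_power = power(mic[-200:])          # echo level before cancellation
residual_power = power(residual[-200:])  # echo level after the filter has converged
```

In this idealized, noise-free setting the filter converges and `residual_power` ends up far below `echo_power`. In real conditions (double talk, nonlinear loudspeakers, changing echo paths) a linear filter leaves residual echo, which is the component the abstract says the network and its loss function are meant to suppress.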