Neural Noise Embedding For End-To-End Speech Enhancement With Conditional Layer Normalization
Zhihui Zhang, Xiaoqi Li, Yaxing Li, Yuanjie Dong, Dan Wang, Shengwu Xiong
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:09:00
Most deep-learning-based speech enhancement methods focus on modeling the complicated relationship between noisy speech and clean speech without considering noise information. To cope with various complex noise scenes, we introduce a novel enhancement architecture that integrates a deep autoencoder with neural noise embedding. We also introduce a new normalization method, termed conditional layer normalization (CLN), to improve the generalization of deep-learning-based speech enhancement approaches to unseen environments. The noise embedding is passed through the CLN layers to regularize the network for the speech enhancement task, so the network adapts to the noise information extracted from the noisy speech input. The overall network is trained in an end-to-end manner, and the experimental results show that the proposed scheme produces satisfactory enhancement performance compared with other methods. Visualization shows that the proposed network captures noise information, which helps improve the robustness of speech enhancement in unseen environments.
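The abstract does not give the CLN formulation, so the following is only an illustrative sketch of one common way to condition layer normalization on an embedding. It assumes the gain and bias are linear projections of the noise embedding; the function names, the weight matrices `w_gamma` and `w_beta`, and the toy values are all hypothetical and are not taken from the paper.

```python
import math


def layer_norm(x, eps=1e-5):
    """Standardize one feature vector to zero mean and (near) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def project(emb, weights):
    """Linear projection; `weights` holds one row per output feature."""
    return [sum(e * w for e, w in zip(emb, row)) for row in weights]


def conditional_layer_norm(x, noise_emb, w_gamma, w_beta, eps=1e-5):
    """Layer norm whose gain and bias depend on a noise embedding.

    ASSUMPTION: gain = 1 + W_gamma * emb and bias = W_beta * emb.
    The paper's actual parameterization may differ.
    """
    x_hat = layer_norm(x, eps)
    gamma = [1.0 + g for g in project(noise_emb, w_gamma)]
    beta = project(noise_emb, w_beta)
    return [g * h + b for g, h, b in zip(gamma, x_hat, beta)]


# Toy example: 4 hidden features, 2-dimensional noise embedding (made-up values).
features = [1.0, 2.0, 3.0, 4.0]
noise_emb = [0.5, -1.0]
w_gamma = [[0.2, 0.0], [0.0, 0.1], [0.1, 0.1], [-0.1, 0.0]]
w_beta = [[0.0, 0.3], [0.1, 0.0], [0.0, 0.0], [0.2, 0.2]]

enhanced = conditional_layer_norm(features, noise_emb, w_gamma, w_beta)
print(enhanced)
```

With an all-zero embedding the gain is 1 and the bias is 0, so this sketch reduces to ordinary layer normalization; a non-zero embedding rescales and shifts each feature, which is one way a network could adapt to the noise information extracted from its input.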
Chairs:
Ann Spriet