Pagan: A Phase-Adapted Generative Adversarial Networks For Speech Enhancement
Peishuo Li, Zihang Jiang, Shouyi Yin, Dandan Song, Peng Ouyang, Leibo Liu, Shaojun Wei
SPS
Length: 12:59
Deep neural networks (DNNs) are increasingly popular in speech enhancement. Most DNN-based speech enhancement approaches currently operate on magnitude spectra and ignore the phase mismatch between noisy and clean speech, which greatly limits enhancement performance. This paper presents a new approach to the phase mismatch problem: training a conventional DNN adversarially with a time-domain discriminator. Instead of estimating a more accurate phase, the DNN is trained to adapt to the noisy phase and to minimize the influence of the phase mismatch. We also propose a new evaluation metric that measures the degree of adaptation to the noisy phase. Experimental results show that adding a time-domain discriminator yields a more phase-adapted generator and significantly improves speech enhancement performance.
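The phase mismatch described in the abstract can be illustrated with a toy example. The sketch below is not the paper's model or its proposed metric; it only shows that even a perfect (oracle) magnitude estimate leaves a time-domain error once it is recombined with the noisy phase. The test signal, the noise level, and the helper names (`dft`, `idft`, `recombine`, `mse`) are all assumptions made for illustration.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT; returns the real part of each time-domain sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def recombine(magnitude_from, phase_from):
    """Build a spectrum from one spectrum's magnitudes and another's phases."""
    return [abs(m) * cmath.exp(1j * cmath.phase(p))
            for m, p in zip(magnitude_from, phase_from)]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical toy data: one frame of a sine wave plus Gaussian noise.
random.seed(0)
n = 64
clean = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
noisy = [c + random.gauss(0.0, 0.5) for c in clean]

clean_spec = dft(clean)
noisy_spec = dft(noisy)

# Oracle magnitude (taken from the clean signal) paired with each phase.
with_clean_phase = idft(recombine(clean_spec, clean_spec))
with_noisy_phase = idft(recombine(clean_spec, noisy_spec))

err_clean_phase = mse(with_clean_phase, clean)
err_noisy_phase = mse(with_noisy_phase, clean)
print("MSE with clean phase:", err_clean_phase)
print("MSE with noisy phase:", err_noisy_phase)
```

With the clean phase, the reconstruction matches the clean frame up to floating-point error; with the noisy phase, a residual error remains even though the magnitude is exact. A loss computed on the time-domain waveform, as a time-domain discriminator would see it, is sensitive to this residual, whereas a loss on magnitude spectra alone is not.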