Speech Enhancement with Intelligent Neural Homomorphic Synthesis

Shulin He (College of Computer Science, Inner Mongolia University); Wei Rao (Tencent); Jinjiang Liu (College of Computer Science, Inner Mongolia University); Jun Chen (Tencent); Yukai Ju (Tencent); Xueliang Zhang (Inner Mongolia University); Yannan Wang (Tencent); Shi-dong Shang (Tencent)

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
06 Jun 2023

Most neural network speech enhancement models ignore the mathematical model of speech production and directly map Fourier spectra or waveforms. In this work, we propose a neural source-filter network for speech enhancement. Specifically, we use homomorphic signal processing and cepstral analysis to obtain the excitation and vocal tract components of noisy speech. Unlike traditional signal processing, we replace the liftering separation function with a ratio mask predicted by an attentive recurrent network (ARN). Two convolutional attentive recurrent networks (CARNs) are then used to predict the excitation and vocal tract of clean speech, respectively. The system's output is synthesized from the estimated excitation and vocal tract. Experiments show that the proposed method performs better, improving SI-SNR by 1.363 dB over FullSubNet.
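For context, the traditional liftering step that the abstract says is replaced by a learned ratio mask can be sketched as follows. This is a minimal NumPy illustration of classical homomorphic cepstral analysis (not the paper's ARN-based method): the real cepstrum of a frame is split by a fixed low-quefrency lifter into a smooth vocal-tract (spectral envelope) part and a residual excitation part. The function name and the cutoff value are illustrative assumptions, not from the paper.

```python
import numpy as np

def homomorphic_split(frame, lifter_cutoff=30):
    """Split a speech frame's log-magnitude spectrum into vocal-tract
    and excitation components via cepstral liftering (classical method).

    Returns (vocal_tract_log, excitation_log), which sum to the
    frame's log-magnitude spectrum.
    """
    # Log-magnitude spectrum of the frame (small floor avoids log(0))
    spec = np.fft.rfft(frame)
    log_mag = np.log(np.abs(spec) + 1e-8)

    # Real cepstrum: inverse FFT of the log-magnitude spectrum
    cepstrum = np.fft.irfft(log_mag)

    # Low-quefrency lifter (symmetric, since the cepstrum is real):
    # low quefrencies carry the slowly varying envelope (vocal tract),
    # high quefrencies carry the fine structure (excitation)
    lifter = np.zeros_like(cepstrum)
    lifter[:lifter_cutoff] = 1.0
    lifter[-(lifter_cutoff - 1):] = 1.0

    # Liftered cepstrum back to the log-spectral domain
    vocal_tract_log = np.fft.rfft(cepstrum * lifter).real
    excitation_log = log_mag - vocal_tract_log
    return vocal_tract_log, excitation_log
```

The paper's contribution replaces this fixed lifter with a mask predicted by an ARN, so the split adapts to the signal instead of using a hard quefrency cutoff.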
