Enhancement Of Coded Speech Using A Mask-Based Post-Filter

Srikanth Korse, Kishan Gupta, Guillaume Fuchs

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 15:00
04 May 2020

The quality of speech codecs deteriorates at low bitrates due to high quantization noise. A post-filter is generally employed to enhance the quality of the coded speech. In this paper, a data-driven post-filter relying on masking in the time-frequency domain is proposed. A fully connected neural network (FCNN), a convolutional encoder-decoder (CED) network and a long short-term memory (LSTM) network are implemented to estimate a real-valued mask per time-frequency bin. The proposed models were tested on the five lowest operating modes (6.65 kbps-15.85 kbps) of the Adaptive Multi-Rate Wideband codec (AMR-WB). Both objective and subjective evaluations confirm the enhancement of the coded speech and also show the superiority of the mask-based neural network system over a conventional heuristic post-filter used in standards such as ITU-T G.718.
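
To make the idea of a real-valued mask per time-frequency bin concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the helper names (`stft`, `oracle_mask`, `apply_mask`), the frame length, the hop size, the gain ceiling, and the use of additive noise as a stand-in for codec distortion are all assumptions made for illustration. The sketch computes an "oracle" mask from a clean reference, which is one plausible training target; in the proposed system a neural network (FCNN, CED or LSTM) would estimate the mask from the coded speech alone.

```python
import numpy as np


def stft(x, frame_len=512, hop=256):
    """Naive Hann-windowed STFT; returns a (frames, bins) complex array."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def oracle_mask(clean_spec, coded_spec, max_gain=2.0, eps=1e-8):
    """Real-valued mask per time-frequency bin (assumed target definition):
    ratio of clean to coded magnitude, clipped to [0, max_gain]."""
    mask = np.abs(clean_spec) / (np.abs(coded_spec) + eps)
    return np.clip(mask, 0.0, max_gain)


def apply_mask(coded_spec, mask):
    """Scale each bin's magnitude by the mask; the coded phase is kept,
    because multiplying by a non-negative real number leaves phase unchanged."""
    return mask * coded_spec


# Toy signals: a sine as "clean" speech, plus noise as a stand-in for
# quantization noise (an assumption; no real codec is involved here).
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2 * np.pi * 440.0 * t)
coded = clean + 0.1 * rng.standard_normal(clean.shape)

clean_spec = stft(clean)
coded_spec = stft(coded)
mask = oracle_mask(clean_spec, coded_spec)
enhanced_spec = apply_mask(coded_spec, mask)

# Mean magnitude error against the clean reference, before and after masking.
err_before = np.mean(np.abs(np.abs(coded_spec) - np.abs(clean_spec)))
err_after = np.mean(np.abs(np.abs(enhanced_spec) - np.abs(clean_spec)))
```

Because the mask only rescales magnitudes, any phase error introduced by the codec remains in the enhanced spectrum. Clipping the mask bounds how much a single bin can be amplified, which limits the damage when the mask is estimated rather than computed from a reference.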