  • SPS Members: Free
  • IEEE Members: $11.00
  • Non-members: $15.00
  • Length: 12:56
04 May 2020

In this work, we propose a fully convolutional neural network for real-time speech enhancement in the time domain. The proposed network is an encoder-decoder architecture with skip connections. Each layer in the encoder and the decoder is followed by a densely connected block comprising dilated and causal convolutions. The dilated convolutions aggregate context at different resolutions. The causal convolutions prevent information flow from future frames, which makes the network suitable for real-time applications. We also propose to use sub-pixel convolutional layers in the decoder for upsampling. Further, the model is trained using a loss function with two components: a time-domain loss and a frequency-domain loss. The proposed combined loss outperforms training with the time-domain loss alone. Experimental results show that the proposed model significantly outperforms other state-of-the-art real-time models in terms of objective intelligibility and quality scores.
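The three building blocks named in the abstract (causal dilated convolution, sub-pixel upsampling, and a two-part loss) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The function names, the 1-D channel interleaving order, the use of mean absolute error, the single-frame DFT magnitude, and the weight `alpha` are all assumptions made for illustration; the abstract does not specify any of them.

```python
import cmath


def causal_dilated_conv1d(x, kernel, dilation=1):
    """y[t] = sum_k kernel[k] * x[t - k*dilation]; samples before t=0 count as zero.

    Only current and past samples are read, so the output at time t never
    depends on future input (the causality property the abstract relies on).
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


def subpixel_upsample1d(channels, r):
    """Rearrange r*C channels of length T into C channels of length r*T.

    Assumed convention: input channel c*r + j supplies output positions
    t*r + j of output channel c (a 1-D analogue of pixel shuffle).
    """
    assert len(channels) % r == 0, "channel count must be divisible by r"
    c_out = len(channels) // r
    length = len(channels[0])
    out = []
    for c in range(c_out):
        group = channels[c * r:(c + 1) * r]
        out.append([group[p % r][p // r] for p in range(r * length)])
    return out


def dft_magnitudes(x):
    """Magnitude of a naive O(n^2) DFT over the whole signal (assumed stand-in for a spectrogram)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n)))
        for f in range(n)
    ]


def combined_loss(estimate, target, alpha=0.5):
    """Weighted sum of a time-domain and a frequency-domain error (both mean absolute error here)."""
    n = len(target)
    time_loss = sum(abs(e - s) for e, s in zip(estimate, target)) / n
    est_mag = dft_magnitudes(estimate)
    tgt_mag = dft_magnitudes(target)
    freq_loss = sum(abs(a - b) for a, b in zip(est_mag, tgt_mag)) / n
    return alpha * time_loss + (1 - alpha) * freq_loss
```

With a dilation of 2 and a two-tap kernel, each output sample combines the current input with the one two steps back, so stacking layers with growing dilation widens the receptive field without looking ahead. The upsampling step doubles (or multiplies by `r`) the temporal length by interleaving channels rather than inserting zeros.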
