Skip to main content
  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 13:56
04 May 2020

Time-domain audio separation networks based on dilated temporal convolutions have recently been shown to perform very well compared to methods that are based on a time-frequency representation in speech separation tasks, even outperforming an oracle binary time-frequency mask of the speakers. This paper investigates the performance of such a time-domain network (Conv-TasNet) for speech denoising in a real-time setting, comparing various parameter settings. Most importantly, different amounts of lookahead are evaluated and compared to the baseline of a fully causal model. We show that a large part of the increase in performance between a causal and non-causal model is achieved with a lookahead of only $20~$ milliseconds, demonstrating the usefulness of even small lookaheads for many real-time applications.

Value-Added Bundle(s) Including this Product

More Like This

  • SPS
    Members: $150.00
    IEEE Members: $250.00
    Non-members: $350.00
  • SPS
    Members: $150.00
    IEEE Members: $250.00
    Non-members: $350.00
  • SPS
    Members: $150.00
    IEEE Members: $250.00
    Non-members: $350.00