Length: 00:09:44
10 Jun 2021

This paper proposes a low-latency online extension of Wave-U-Net for single-channel speech enhancement, which uses teacher-student learning to reduce system latency while maintaining high enhancement performance. Wave-U-Net is a recently proposed end-to-end source separation method that has achieved remarkable performance in singing voice separation and speech enhancement tasks. Because enhancement is performed in the time domain, Wave-U-Net can efficiently model phase information and avoids the limitations of the domain transformation required by methods that operate in the time-frequency domain. We aim to apply Wave-U-Net to face-to-face applications such as hearing aids and in-car communication systems, which require a strict latency of less than 10 ms. To this end, we investigate online versions of Wave-U-Net and propose using teacher-student learning to avoid the performance degradation caused by shortening the input segment so that the system delay on a CPU is less than 10 ms. The experimental results show that the proposed model runs in real time with low latency while achieving a signal-to-distortion ratio improvement of about 8.35 dB.
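
The two quantities the abstract reports, segment-induced delay and signal-to-distortion ratio (SDR) improvement, can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The function names are hypothetical, the SDR uses the simple energy-ratio definition rather than any specific toolkit, and the 16 kHz sample rate in the usage note is an assumption that does not come from the abstract.

```python
import math


def segment_duration_ms(num_samples, sample_rate_hz):
    """Time spanned by one input segment, in milliseconds.

    This is only the buffering part of the delay; processing time on the
    CPU adds to it in a real system.
    """
    return 1000.0 * num_samples / sample_rate_hz


def sdr_db(reference, estimate):
    """Simple SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    Assumes equal-length sequences and a non-zero error signal.
    """
    signal_power = sum(s * s for s in reference)
    error_power = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_power / error_power)


def sdr_improvement_db(reference, noisy, enhanced):
    """SDR of the enhanced output minus SDR of the unprocessed noisy input."""
    return sdr_db(reference, enhanced) - sdr_db(reference, noisy)
```

For instance, under the assumed 16 kHz sample rate, `segment_duration_ms(160, 16000)` shows that a 160-sample segment already spans 10 ms of audio, so the segment plus processing time would have to stay below that budget. This is why shortening the input segment is central to meeting the latency target.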

Chairs:
Timo Gerkmann
