Analysing Diffusion-based Generative Approaches versus Discriminative Approaches for Speech Restoration

Jean-Marie Lemercier (Universität Hamburg); Julius Richter (Universität Hamburg); Simon Welker (Universität Hamburg); Timo Gerkmann (Universität Hamburg)

07 Jun 2023

Diffusion-based generative models have had a strong impact on the computer vision and speech processing communities in recent years. Besides data generation tasks, they have also been employed for data restoration tasks such as speech enhancement and dereverberation. While discriminative models have traditionally been argued to be more powerful, e.g. for speech enhancement, generative diffusion approaches have recently been shown to narrow this performance gap considerably. In this paper, we systematically compare the performance of generative diffusion models and discriminative approaches on different speech restoration tasks. For this, we extend our prior contributions on diffusion-based speech enhancement in the complex time-frequency domain to the task of bandwidth extension. We then compare it to a discriminatively trained neural network with the same network architecture on three restoration tasks, namely speech denoising, dereverberation and bandwidth extension. We show that the generative diffusion approach performs overall on par with the discriminative approach on speech enhancement. We also show that it significantly outperforms discriminative approaches for non-additive corruption models, as in the case of dereverberation and bandwidth extension. Code and audio examples can be found online at https://www.inf.uni-hamburg.de/en/inst/ab/sp/publications/sgmse-multitask.html.
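
To make the distinction between additive and non-additive corruption concrete, the following is a minimal NumPy sketch of the three corruption types named in the abstract. It uses generic textbook signal models and is not the data simulation used in the paper. The function names, the random stand-in signals, the exponentially decaying synthetic impulse response and the FFT-bin lowpass are all illustrative assumptions.

```python
import numpy as np


def add_noise(clean, noise, snr_db):
    """Additive corruption: y = x + g * n, with g set to reach the target SNR."""
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + gain * noise


def reverberate(clean, rir):
    """Convolutive (non-additive) corruption: y = x * h, cut to the input length."""
    return np.convolve(clean, rir)[: len(clean)]


def reduce_bandwidth(clean, factor):
    """Band-limiting (non-additive) corruption: zero the upper FFT bins.

    This is a crude lowpass that keeps the original length; `factor` is the
    assumed integer bandwidth reduction factor.
    """
    spec = np.fft.rfft(clean)
    cutoff = len(spec) // factor
    spec[cutoff:] = 0
    return np.fft.irfft(spec, n=len(clean))


# Illustrative usage on random stand-in signals (assumed 16 kHz, 1 second).
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
noise = rng.standard_normal(16000)
# Hypothetical room impulse response: exponentially decaying noise.
rir = rng.standard_normal(4000) * np.exp(-np.arange(4000) / 800.0)

noisy = add_noise(clean, noise, snr_db=5.0)
reverberant = reverberate(clean, rir)
narrowband = reduce_bandwidth(clean, factor=2)
```

Only in the first case is the corrupted signal the clean signal plus an independent term. In the other two, the corruption acts on the clean signal itself, which is the setting in which the abstract reports the largest advantage for the generative approach.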

Tags: System identification and reverberation reduction

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
