Speech Signal Improvement Using Causal Generative Diffusion Models

Julius Richter (Universität Hamburg); Simon Welker (Universität Hamburg); Jean-Marie Lemercier (Universität Hamburg); Bunlong Lay (Universität Hamburg); Tal Peer (Universität Hamburg); Timo Gerkmann (Universität Hamburg)

SPS

10 Jun 2023

In this paper, we present a causal speech signal improvement system designed to handle different types of distortions. The method is based on a generative diffusion model, an approach that has been shown to work well in scenarios with missing data and non-linear corruptions. To guarantee causal processing, we modify the network architecture of our previous work and replace global normalization with causal adaptive gain control. We generate diverse training data containing a broad range of distortions. This work was performed in the context of an “ICASSP Signal Processing Grand Challenge” and submitted to the non-real-time track of the “Speech Signal Improvement Challenge 2023”, where it was ranked fifth.
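
To illustrate what replacing global normalization with a causal gain control can look like, here is a minimal sketch. It is not the authors' implementation: the abstract does not specify their design, so the function name `causal_agc`, the exponential power smoothing, and the parameters `target_rms`, `alpha`, and `eps` are all assumptions chosen for illustration. The point it shows is that the gain at each sample depends only on past and present samples, whereas global normalization needs statistics of the whole utterance.

```python
import numpy as np


def causal_agc(x, target_rms=0.1, alpha=0.999, eps=1e-8):
    """Hypothetical causal gain control (illustrative, not the paper's method).

    Tracks signal power with a one-pole recursive average and scales each
    sample so that the running RMS approaches target_rms. No future samples
    are used, so the processing is causal.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.empty_like(x)
    # Assumption: start the power estimate at the target level so the
    # initial gain is close to one.
    power = target_rms ** 2
    for n, sample in enumerate(x):
        # Recursive power estimate from past and present samples only.
        power = alpha * power + (1.0 - alpha) * sample ** 2
        gain = target_rms / np.sqrt(power + eps)
        y[n] = gain * sample
    return y


# Usage: a sine with amplitude 0.5 is brought to a running RMS near 0.1.
t = np.arange(20000)
signal = 0.5 * np.sin(2 * np.pi * 0.05 * t)
normalized = causal_agc(signal)
```

A smoothing factor closer to one gives a slower, more stable gain; a smaller value reacts faster to level changes but lets the gain fluctuate more within the signal.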

Tags:

Signal Processing for Communications and Networking