Speech Dereverberation Using Variational Autoencoders

Deepak Baby, Hervé Bourlard

DOI

SPS

Members: Free
IEEE Members: $11.00
Non-members: $15.00

Length: 00:14:33

08 Jun 2021

This paper presents a statistical method for single-channel speech dereverberation using a variational autoencoder (VAE) for modelling the speech spectra. One popular approach for modelling speech spectra is to use non-negative matrix factorization (NMF) where learned clean speech spectral bases are used as a linear generative model for speech spectra. This work replaces this linear model with a powerful nonlinear deep generative model based on VAE. Further, this paper formulates a unified probabilistic generative model of reverberant speech based on Gaussian and Poisson distributions. We develop a Monte Carlo expectation-maximization algorithm for inferring the latent variables in the VAE and estimating the room impulse response for both probabilistic models. Evaluation results show the superiority of the proposed VAE-based models over the NMF-based counterparts.

Chairs:

Takuya Yoshioka

Tags:

signal processing society

IEEE icassp 2021

virtual conference

2021

sps

virtual conference icassp 2021

june 6-11 2021

icassp 2021