Maximum A Posteriori Estimator For Convolutive Sound Source Separation With Sub-Source Based Ntf Model And The Localization Probabilistic Prior On The Mixing Matrix

Mieszko Fraś, Konrad Kowalczyk

DOI

SPS

Members: Free
IEEE Members: $11.00
Non-members: $15.00

Length: 00:10:14

10 Jun 2021

In this paper we present a method for the separation of sound source signals recorded using multiple microphones in a reverberant room. In particular, we propose a maximum a posteriori (MAP) estimator based on the multichannel nonnegative tensor factorization (NTF) model with the localization prior distribution on the mixing matrix, in which the latent data consists of the so-called sub-sources for an improved performance in a reverberant environment. For the proposed MAP estimator, we derive the sub-source based expectation maximization (EM) algorithm with the multiplicative update rules (MU) and the localization prior distribution (LP) on the mixing matrix (SSEM-MU-LP). We then perform several experiments for speech and instrumental sound sources recorded using two microphones, in determined and under-determined scenarios, and with different types of initialization of the model parameters. The results of these experiments clearly indicate a significant improvement of the proposed algorithm with the localization prior over the state-of-the-art NTF-based source separation algorithms, which can reach up to $50\%$ in the signal-to-distortion ratio.

Chairs:

Jonathan Le Roux

Tags:

signal processing society

IEEE icassp 2021

virtual conference

2021

sps

virtual conference icassp 2021

june 6-11 2021

icassp 2021