Speech Enhancement Using Masking For Binaural Reproduction Of Ambisonics Signals
Moti Lugasi, Boaz Rafaely
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:14:38
Single-channel speech enhancement has been well studied in the literature for many applications. However, in emerging applications such as virtual reality, it is important not only to attenuate undesired signals but also to preserve the spatial information of the desired signal captured in a noisy environment. Only a few studies in the literature propose solutions to this challenge, and most of them attenuate the undesired signals while preserving only limited spatial information about the desired signal. Methods that preserve complete spatial information have only recently been suggested and have not been studied comprehensively. In this paper, two such methods based on time-frequency masking are investigated, with the aim of attenuating the undesired signal while preserving the spatial components of the desired signal. The first, referred to as spatial masking, is based on masking in the plane wave density domain; the second is based on masking in the spherical harmonics (SH) domain. The two methods are compared with a reference method based on beamforming followed by single-channel time-frequency masking. An objective analysis and two listening tests were conducted to evaluate the performance of these methods for speech enhancement.
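The abstract does not say how the masks are estimated or applied, so the following is only an illustrative sketch of the general idea of time-frequency masking in the SH domain, not the authors' method. It assumes the Ambisonics signals are already available as an STFT array of shape (channels, frequency, time) and that per-bin speech and noise power estimates exist. The function names `ratio_mask` and `apply_shared_mask` are hypothetical. One real-valued mask is applied to every SH channel, which scales each time-frequency bin without changing the ratios between channels.

```python
import numpy as np


def ratio_mask(speech_power, noise_power, floor=1e-12):
    """Wiener-like ratio mask in [0, 1] from per-bin power estimates.

    Assumption: the power estimates are given; how to obtain them is
    outside the scope of this sketch.
    """
    return speech_power / (speech_power + noise_power + floor)


def apply_shared_mask(sh_stft, mask):
    """Apply one real-valued (freq, time) mask to every SH channel.

    sh_stft: complex array, shape (channels, freq, time)
    mask:    real array, shape (freq, time)
    """
    sh_stft = np.asarray(sh_stft)
    mask = np.asarray(mask, dtype=float)
    if sh_stft.shape[1:] != mask.shape:
        raise ValueError("mask must match the (freq, time) shape of sh_stft")
    # Broadcasting over the channel axis applies the same gain to all channels.
    return sh_stft * mask[np.newaxis, :, :]


# Toy data: first-order Ambisonics has 4 SH channels.
rng = np.random.default_rng(0)
n_ch, n_freq, n_time = 4, 5, 6
sh_stft = (rng.standard_normal((n_ch, n_freq, n_time))
           + 1j * rng.standard_normal((n_ch, n_freq, n_time)))
speech_power = rng.uniform(0.1, 1.0, (n_freq, n_time))
noise_power = rng.uniform(0.1, 1.0, (n_freq, n_time))

mask = ratio_mask(speech_power, noise_power)
enhanced = apply_shared_mask(sh_stft, mask)
```

Because the gain is real and identical across channels, the inter-channel amplitude and phase relationships in each bin are left unchanged, which is the property that makes this kind of masking compatible with subsequent binaural rendering.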
Chairs:
Prasanga Samarasinghe