Improving Speech Enhancement via Event-based Query

Yifei Xin (Peking University); Xiulian Peng (Microsoft Research Asia); Yan Lu (Microsoft Research Asia)

06 Jun 2023

Existing deep-learning-based speech enhancement (SE) methods either use blind end-to-end training or explicitly incorporate speaker embeddings or phonetic information into the SE network to enhance speech quality. In this paper, we treat speech and noise as different types of sound events and propose an event-based query method for SE. Specifically, speech embeddings that can discriminate speech from noise are first pre-trained on the sound event detection (SED) task. The embeddings are then clustered, on a diverse clean speech dataset, into fixed golden speech queries, i.e., general but representative speech embeddings, which assist the SE network. The golden speech queries can be obtained offline and generalize to different SE datasets and networks. Therefore, little extra complexity is introduced and no per-speaker enrollment is needed. Experimental results show that the proposed method yields significant gains over the baselines and that the golden queries generalize well to different datasets.
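The offline clustering step described in the abstract could look roughly like the sketch below. It is only an illustration under stated assumptions: the abstract does not name the clustering algorithm, so plain k-means is used here as a stand-in, and the function name `cluster_golden_queries`, the embedding dimension (16), the number of queries (4), and the random stand-in embeddings are all hypothetical rather than taken from the paper.

```python
import numpy as np


def cluster_golden_queries(embeddings, num_queries, num_iters=50, seed=0):
    """Cluster speech embeddings into a fixed set of query vectors.

    Assumption: plain k-means; the paper's actual clustering method
    is not specified in the abstract.
    """
    rng = np.random.default_rng(seed)
    # Initialise centroids from distinct, randomly chosen embeddings.
    idx = rng.choice(len(embeddings), size=num_queries, replace=False)
    centroids = embeddings[idx].copy()

    for _ in range(num_iters):
        # Squared Euclidean distance from every embedding to every centroid,
        # shape (num_embeddings, num_queries).
        diffs = embeddings[:, None, :] - centroids[None, :, :]
        dists = (diffs ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)

        # Move each centroid to the mean of its assigned embeddings;
        # an empty cluster keeps its previous centroid.
        new_centroids = centroids.copy()
        for k in range(num_queries):
            members = embeddings[labels == k]
            if len(members) > 0:
                new_centroids[k] = members.mean(axis=0)

        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    return centroids


# Hypothetical stand-in for embeddings from an SED-pretrained encoder
# applied to clean speech: 500 vectors of dimension 16.
rng = np.random.default_rng(1)
fake_embeddings = rng.normal(size=(500, 16)).astype(np.float32)

# Computed once, offline; the resulting queries are then kept fixed.
golden_queries = cluster_golden_queries(fake_embeddings, num_queries=4)
```

Because the queries are computed once and then frozen, the SE network would receive the same small matrix for every utterance, which is consistent with the abstract's point that no per-speaker enrollment is needed.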

Tags:

Detection and classification of acoustic scenes and events

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
