AURA: PRIVACY-PRESERVING AUGMENTATION TO IMPROVE TEST SET DIVERSITY IN SPEECH ENHANCEMENT

Xavier Gitiaux (Microsoft); Aditya Khant (Microsoft); Ross Cutler (Microsoft Corporation); Chandan Reddy (Google); Ebrahim Beyrami (Microsoft); Jayant Gupchup (Microsoft)


06 Jun 2023

Speech enhancement models running in production environments are commonly trained on publicly available datasets. However, this approach leads to regressions because the models are neither trained nor tested on representative customer data. Moreover, for privacy reasons, developers cannot listen to customer content. This 'ears-off' situation motivates Aura, an end-to-end solution that makes existing speech enhancement train and test sets more challenging and diverse while remaining sample efficient. Aura is 'ears-off' because it relies on a feature extractor and two speech quality metrics, DNSMOS P.835 and AECMOS, that are pre-trained on data obtained from public sources. We apply Aura to evaluate two speech enhancement tasks: noise suppression (NS) and acoustic echo cancellation (AEC). For the NS task, we augment the INTERSPEECH 2021 DNS Challenge test set by sampling audio files from a new batch of noisy speech. For the AEC task, we sample the INTERSPEECH 2021 AEC Challenge dataset. Aura samples an NS test set that is 0.42 harder in terms of P.835 OVRL than random sampling, and an AEC test set that is 1.93 harder in terms of AECMOS. In addition, Aura increases diversity by 30% for NS tasks and by 530% for AEC tasks compared to greedy sampling. Finally, when used to stack rank NS models, Aura achieves a 26% improvement in Spearman's rank correlation coefficient (SRCC) compared to random sampling.
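
The stack-ranking result above is reported as a Spearman's rank correlation coefficient (SRCC). As a minimal sketch of what that metric measures, the Python below computes SRCC between two score lists as the Pearson correlation of their ranks. This is the textbook definition, not code from the paper. The model scores, the variable names, and the helper functions are all hypothetical and chosen only for illustration.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    norm_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (norm_x * norm_y)


def spearman_srcc(x, y):
    """SRCC: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return pearson(ranks(x), ranks(y))


# Hypothetical mean quality scores for four NS models: once on a large
# reference set, once on a small sampled test set (made-up numbers).
reference_scores = [3.1, 3.4, 2.9, 3.8]
sampled_scores = [3.0, 3.5, 3.1, 3.7]

srcc = spearman_srcc(reference_scores, sampled_scores)
print(f"SRCC = {srcc:.2f}")
```

An SRCC close to 1 means the sampled test set orders the models in nearly the same way as the reference ordering, which is the property a stack-ranking test set needs.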

Tags:

Applications of machine learning


Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
