04 May 2020

End-to-end models yield impressive speech recognition results on clean datasets but perform substantially worse on noisy ones. To address this, we propose transfer learning from a clean dataset (WSJ) to a noisy dataset (CHiME-4) for connectionist temporal classification (CTC) models. We argue that the clean classifier (the upper layers of a neural network trained on clean data) can force the feature extractor (the lower layers) to learn the underlying noise-invariant patterns in the noisy dataset. While training on the noisy dataset, the clean classifier is either frozen or trained with a small learning rate, whereas the feature extractor is trained with no learning rate re-scaling. The proposed method yields up to a 15.5% relative character error rate (CER) reduction compared to models trained only on CHiME-4. Furthermore, we use the test sets of Aurora-4 to evaluate on unseen noisy conditions. Our method achieves significantly lower CERs (11.3% relative on average) on all 14 Aurora-4 test sets compared to the conventional transfer learning method (no learning rate re-scaling for any layer), indicating that our method enables the model to learn noise-invariant features.
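The freeze-or-small-LR recipe above maps directly onto per-parameter-group learning rates in modern toolkits. Below is a minimal PyTorch sketch of both variants, assuming an LSTM-based CTC model; the `CTCModel` class, the layer sizes, and the 0.01 re-scale factor are illustrative assumptions, not the paper's exact architecture or hyperparameters.

```python
import torch
import torch.nn as nn

# Hypothetical CTC model: the lower layers act as the feature extractor,
# the upper layer as the "clean classifier" pretrained on clean data (WSJ).
class CTCModel(nn.Module):
    def __init__(self, n_mels=80, hidden=512, n_tokens=32):
        super().__init__()
        # Feature extractor (lower layers): trained at the full learning rate.
        self.feature_extractor = nn.LSTM(
            input_size=n_mels, hidden_size=hidden,
            num_layers=3, batch_first=True, bidirectional=True)
        # Clean classifier (upper layers): frozen or trained with a small LR.
        self.classifier = nn.Linear(2 * hidden, n_tokens)

    def forward(self, feats):
        hidden_states, _ = self.feature_extractor(feats)
        return self.classifier(hidden_states)

model = CTCModel()
# model.load_state_dict(torch.load("wsj_pretrained.pt"))  # clean-data weights

# Option 1: freeze the clean classifier entirely.
# for p in model.classifier.parameters():
#     p.requires_grad = False

# Option 2: train the classifier with a small learning rate via parameter
# groups, while the feature extractor keeps the full rate (no re-scaling).
base_lr = 1e-3
optimizer = torch.optim.Adam([
    {"params": model.feature_extractor.parameters(), "lr": base_lr},
    {"params": model.classifier.parameters(), "lr": base_lr * 0.01},
])

# Fine-tuning on the noisy dataset (CHiME-4) then proceeds as usual:
ctc_loss = nn.CTCLoss(blank=0)
# log_probs = model(noisy_feats).log_softmax(-1).transpose(0, 1)  # (T, B, C)
# loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
```

The intuition behind the two parameter groups: because the classifier's mapping from features to characters barely moves, gradients flowing into the lower layers push the feature extractor to produce clean-like, noise-invariant representations from noisy input.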
