Acoustic Model Adaptation For Presentation Transcription And Intelligent Meeting Assistant Systems
Yan Huang, Yifan Gong
We present our solution for unsupervised rapid speaker adaptation in a state-of-the-art presentation and intelligent meeting transcription system. We adopt the Kullback-Leibler (KL) divergence regularized model adaptation paradigm. For the adaptation architecture, we find that linear projection layer adaptation yields competitive performance with the additional benefits of simplicity and robustness to small amounts of adaptation data. To address imperfect supervision, we use a supervision committee, formed from multiple systems or a single-system n-best list, to mask possibly mislabeled frames. To relieve data sparsity, we apply noise and speaking-rate perturbation data augmentation to create a richer adaptation data set. In summary, the proposed solution consists of KL-divergence regularized linear projection layer adaptation with frame masking and data augmentation. On a presentation transcription task and a meeting transcription task, the proposed methodology yields 7.3% and 7.9% relative word error rate (WER) reduction, respectively, against a strong baseline model trained on tens of thousands of hours of speech. To the best of our knowledge, this is the first reported work on rapid speaker adaptation in a state-of-the-art production system.
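To make the adaptation objective concrete, the sketch below illustrates one common form of KL-divergence regularized adaptation combined with frame masking: the per-frame training target is an interpolation of the hard label and the speaker-independent (SI) model's posterior, and frames flagged as unreliable by the supervision committee are zeroed out. This is a minimal illustrative example, not the authors' implementation; all names (e.g. `kl_regularized_loss`, `rho`, `frame_mask`) and the PyTorch framing are assumptions.

```python
# Minimal sketch (PyTorch) of KL-divergence regularized adaptation with
# frame masking; illustrative only, not the paper's actual code.
import torch
import torch.nn.functional as F

def kl_regularized_loss(adapted_logits, si_logits, targets, frame_mask, rho=0.5):
    """Cross entropy against a KLD-regularized soft target, with unreliable
    frames masked out.

    adapted_logits: logits of the speaker-adapted model, shape (T, C)
    si_logits:      logits of the frozen speaker-independent model, shape (T, C)
    targets:        senone labels from the (imperfect) supervision, shape (T,)
    frame_mask:     1.0 for frames kept by the supervision committee, else 0.0
    rho:            regularization weight toward the SI posterior (assumed value)
    """
    log_post = F.log_softmax(adapted_logits, dim=-1)        # adapted-model posterior (log)
    si_post = F.softmax(si_logits, dim=-1).detach()         # SI posterior, no gradient
    one_hot = F.one_hot(targets, adapted_logits.size(-1)).float()
    # KLD-regularized target: (1 - rho) * hard label + rho * SI posterior
    soft_target = (1.0 - rho) * one_hot + rho * si_post
    per_frame = -(soft_target * log_post).sum(dim=-1)       # per-frame cross entropy
    per_frame = per_frame * frame_mask                       # drop possibly mislabeled frames
    return per_frame.sum() / frame_mask.sum().clamp(min=1.0)
```

In this formulation, only the parameters of an inserted linear projection layer would be updated while the rest of the SI model stays frozen, which is what makes the approach robust when only a small amount of adaptation data is available.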