Large Margin Training Improves Language Models for ASR

Jilin Wang, Jiaji Huang, Kenneth Church

SPS Members: Free
IEEE Members: $11.00
Non-members: $15.00

Length: 00:08:15

08 Jun 2021

Language models (LMs) are widely deployed in modern ASR systems. An LM is usually trained by minimizing its perplexity on speech transcripts. However, few studies try to discriminate a "gold" reference from inferior hypotheses. In this work, we propose a large margin language model (LMLM), a general framework that trains an LM to assign a higher score to the "gold" reference and a lower score to inferior hypotheses. The framework is applied to three pretrained LM architectures: left-to-right LSTM, transformer encoder, and transformer decoder. Results show that LMLM significantly outperforms traditional LMs trained by minimizing perplexity, especially in cases where domain shift exists and more robustness is required. Finally, among the three architectures, the transformer encoder achieves the best performance.
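
The idea of scoring the "gold" reference above inferior hypotheses can be illustrated with a pairwise hinge loss. This is only a minimal sketch: the abstract does not state the exact loss used by LMLM, so the hinge form, the margin value of 1.0, the use of summed token log-probabilities as the sentence score, and the names `sequence_score` and `large_margin_loss` are all assumptions made for illustration.

```python
def sequence_score(token_log_probs):
    """Sentence score, assumed here to be the sum of per-token log-probabilities."""
    return sum(token_log_probs)


def large_margin_loss(gold_score, hypothesis_scores, margin=1.0):
    """Average pairwise hinge loss (an assumed form, not taken from the paper).

    Each hypothesis contributes zero loss once the gold reference outscores
    it by at least `margin`; otherwise it contributes the shortfall.
    """
    if not hypothesis_scores:
        return 0.0
    losses = [max(0.0, margin - (gold_score - h)) for h in hypothesis_scores]
    return sum(losses) / len(losses)


# Hypothetical per-token log-probabilities for one reference and two hypotheses.
gold = sequence_score([-0.5, -0.25, -0.25])
hypotheses = [
    sequence_score([-0.5, -1.0, -1.0]),   # clearly worse than the reference
    sequence_score([-0.5, -0.5, -0.25]),  # close to the reference
]
loss = large_margin_loss(gold, hypotheses)
```

In this sketch the first hypothesis already trails the reference by more than the margin and adds nothing to the loss, while the second one is too close and is penalized. Minimizing such a loss would push the LM to widen the gap, in contrast to perplexity training, which only looks at the reference.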

Chairs:

Duc Le

Tags:

signal processing society

IEEE icassp 2021

virtual conference

2021

sps

virtual conference icassp 2021

june 6-11 2021

icassp 2021