Warped Language Models For Noise Robust Language Understanding
Mahdi Namazifar, Gokhan Tur, Dilek Hakkani-Tur
Masked Language Models (MLMs) are self-supervised neural networks trained to fill in the blanks in a given sentence with masked tokens. Despite the tremendous success of MLMs on various text-based tasks, they are not robust for spoken language understanding, especially in the presence of spontaneous conversational speech recognition noise. In this work we introduce Warped Language Models (WLMs), in which input sentences at training time go through the same modifications as in MLMs, plus two additional modifications, namely inserting and dropping random tokens. These two modifications extend and contract the sentence, in addition to the modifications in MLMs, hence the word "warped" in the name. The insertion and drop modifications of the input text during WLM training resemble the types of noise introduced by Automatic Speech Recognition (ASR) errors, and as a result WLMs are likely to be more robust to ASR noise. Through computational results we show that natural language understanding systems built on top of WLMs perform better than those built on top of MLMs, especially in the presence of ASR errors.
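To make the warping idea concrete, the sketch below shows one way the three token-level modifications described in the abstract (masking, random insertion, random dropping) could be applied to an input sentence. The function name, the per-token probabilities, and the decision to keep the original token after an insertion are illustrative assumptions; the paper's exact sampling procedure and hyperparameters are not given here.

```python
import random

# Hypothetical mask symbol; the actual special token depends on the tokenizer used.
MASK_TOKEN = "[MASK]"

def warp_tokens(tokens, vocab, p_mask=0.1, p_insert=0.1, p_drop=0.1, rng=random):
    """Illustrative warping of an input token sequence: mask, insert, or drop tokens.

    The probabilities and sampling scheme are assumptions for illustration only,
    not the authors' training configuration.
    """
    warped = []
    for token in tokens:
        r = rng.random()
        if r < p_drop:
            # Drop: remove the token, contracting the sentence.
            continue
        elif r < p_drop + p_insert:
            # Insert: place a random vocabulary token before the current one,
            # extending the sentence; the original token is kept.
            warped.append(rng.choice(vocab))
            warped.append(token)
        elif r < p_drop + p_insert + p_mask:
            # Mask: replace the token with the mask symbol, as in standard MLM training.
            warped.append(MASK_TOKEN)
        else:
            # Keep: leave the token unchanged.
            warped.append(token)
    return warped

if __name__ == "__main__":
    # Toy example with a tiny vocabulary.
    vocab = ["book", "flight", "please", "tomorrow", "the"]
    sentence = "please book the flight for tomorrow".split()
    print(warp_tokens(sentence, vocab))
```

Because dropped tokens contract the sequence and inserted tokens extend it, the warped sentence no longer aligns one-to-one with the original, which is the sense in which these modifications mimic ASR insertion and deletion errors.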