Pre-Training For Query Rewriting In A Spoken Language Understanding System
Zheng Chen, Xing Fan, Yuan Ling, Lambert Mathias, Chenlei Guo
Query rewriting (QR) is an increasingly important technique for reducing customer friction caused by errors in a spoken language understanding pipeline, whether they originate in speech recognition, language understanding, or entity resolution. In this work, we first propose a neural-retrieval based approach for query rewriting. Then, inspired by the wide success of pre-trained contextual language embeddings, we propose a language-modeling (LM) based approach to pre-train query embeddings on users' historical conversational data with a voice assistant, as a way to compensate for insufficient QR training data. We also propose to use the NLU hypotheses generated by the language understanding system to augment the pre-training. In experiments, we show that pre-training on the conversational data achieves strong performance on the QR task, and that using the NLU hypotheses further benefits performance. Finally, with pre-training providing rich prior information, we find that a small number of query-rewrite pairs is enough for the model to outperform a strong baseline fully trained on all QR data.
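
To make the retrieval formulation concrete, below is a minimal sketch of neural-retrieval QR: embed the incoming query, then return the nearest historical rewrite candidate by cosine similarity. The `encode` function is a toy character-trigram hash embedding standing in for the paper's pre-trained LM query encoder, and the candidate list is hypothetical; both are illustrative assumptions, not the authors' implementation.

```python
import hashlib
import numpy as np

def encode(query: str, dim: int = 256) -> np.ndarray:
    """Toy character-trigram hash embedding (a stand-in for a query
    encoder pre-trained on conversational data)."""
    v = np.zeros(dim)
    padded = f"  {query} "
    for i in range(len(padded) - 2):
        # Hash each trigram into a fixed-size count vector.
        tri = padded[i:i + 3]
        h = int(hashlib.md5(tri.encode()).hexdigest(), 16) % dim
        v[h] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

# Hypothetical index of historical rewrite candidates.
candidates = ["play maroon five", "play the rolling stones", "turn on the lights"]
index = np.stack([encode(c) for c in candidates])

def rewrite(query: str) -> str:
    """Return the nearest candidate by cosine similarity
    (embeddings are unit-norm, so a dot product suffices)."""
    scores = index @ encode(query)
    return candidates[int(np.argmax(scores))]

# A misrecognized query; surface similarity recovers the intended rewrite here.
print(rewrite("play maroon fife"))  # -> "play maroon five"
```

With a genuinely pre-trained encoder, similarity is computed in a learned semantic space rather than over surface trigrams, so rewrites can be retrieved even when the erroneous query and its correction share little surface form.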