Converting Written Language To Spoken Language With Neural Machine Translation For Language Modeling
Shintaro Ando, Masayuki Suzuki, Nobuyasu Itoh, Gakuto Kurata, Nobuaki Minematsu
SPS
When building a language model (LM) for spontaneous speech, the ideal situation is to have a large amount of spoken, in-domain training data. Having such abundant data, however, is not realistic. We address this problem by generating texts in spoken language from texts in written language using a neural machine translation (NMT) model. We collected faithful transcripts of fully spontaneous speech and corresponding written versions and used them as a parallel corpus to train the NMT model. We used top-k random sampling, which generates a wider variety of high-quality texts than other decoding methods for NMT. Our experimental results show that the NMT model is capable of converting written texts in a certain domain to spoken texts, and that the converted texts are effective for training LMs.
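The abstract's decoding strategy, top-k random sampling, draws each output token from the k highest-scoring candidates instead of greedily taking the argmax, which is what lets one source sentence yield many distinct spoken-style variants. A minimal sketch of the per-step token choice (the function name and toy logits are illustrative, not from the paper):

```python
import numpy as np

def top_k_sample(logits, k, rng):
    """Sample one token id from the k highest-scoring logits.

    Illustrative sketch of top-k random sampling: restrict the softmax
    to the k best candidates, then draw randomly from that distribution.
    """
    logits = np.asarray(logits, dtype=np.float64)
    # Indices of the k largest logits (order among them does not matter).
    top_idx = np.argpartition(logits, -k)[-k:]
    top_logits = logits[top_idx]
    # Softmax restricted to the top-k candidates (shifted for stability).
    probs = np.exp(top_logits - top_logits.max())
    probs /= probs.sum()
    return int(rng.choice(top_idx, p=probs))

rng = np.random.default_rng(0)
logits = [2.0, -1.0, 0.5, 3.0, -2.0]
samples = [top_k_sample(logits, k=2, rng=rng) for _ in range(100)]
# With k=2, only the two highest-scoring tokens (ids 0 and 3) can appear.
print(sorted(set(samples)))
```

Repeating this draw at every decoding step produces diverse full-sentence translations, whereas beam search or greedy decoding would return (near-)identical outputs for the same input.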