End-To-End Speech Translation With Self-Contained Vocabulary Manipulation
Mei Tu, Fan Zhang, Wei Liu
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 15:54
In machine translation, vocabulary manipulation reduces the target vocabulary based on the source sentence and a word dictionary, which effectively lowers inference latency for text translation in industrial applications. However, vocabulary manipulation is hard to apply to end-to-end speech-to-text translation, because neither the source text nor a speech-to-target mapping is available. We introduce a method that avoids this dependence. By learning a projection between the speech encoder output and the final target vocabulary, the proposed method allows self-contained vocabulary manipulation without source speech transcripts or external dictionaries. Experimental results show that the proposed method achieves a speedup of about 20% while maintaining comparable translation quality.
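The general idea can be illustrated with a minimal sketch. This is not the authors' implementation: mean pooling over time, a single linear projection, top-k selection, and the names `select_candidates`, `restricted_logits`, and `proj` are all assumptions made for illustration, and the weights are random rather than trained. The sketch shows only how a candidate vocabulary predicted from encoder states can shrink the output layer computation at each decoding step.

```python
import heapq
import random


def select_candidates(encoder_out, proj, k):
    """Pick k target-token ids from speech encoder states alone.

    encoder_out: T frames, each a list of d floats.
    proj: hypothetical learned projection, V rows of d floats each.
    """
    d = len(encoder_out[0])
    # Mean-pool over time (an assumption; other pooling schemes are possible).
    pooled = [sum(frame[i] for frame in encoder_out) / len(encoder_out)
              for i in range(d)]
    # One score per target-vocabulary entry.
    scores = [sum(w * x for w, x in zip(row, pooled)) for row in proj]
    # Keep the k highest-scoring token ids, returned in ascending id order.
    return sorted(heapq.nlargest(k, range(len(proj)), key=scores.__getitem__))


def restricted_logits(dec_state, out_weight, out_bias, candidates):
    """Score only the candidate rows of the output layer (k dot products, not V)."""
    return [sum(w * x for w, x in zip(out_weight[j], dec_state)) + out_bias[j]
            for j in candidates]


# Toy dimensions and random weights, purely for demonstration.
rng = random.Random(0)
T, d, V, k = 20, 8, 200, 25
encoder_out = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(T)]
proj = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(V)]
out_weight = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(V)]
out_bias = [rng.gauss(0, 1) for _ in range(V)]
dec_state = [rng.gauss(0, 1) for _ in range(d)]

# Computed once per utterance, then reused at every decoding step.
candidates = select_candidates(encoder_out, proj, k)
logits = restricted_logits(dec_state, out_weight, out_bias, candidates)
# Map the best position in the reduced vocabulary back to a full-vocabulary id.
best_token = candidates[max(range(k), key=logits.__getitem__)]
```

In this sketch the candidate set is computed once per utterance, so each decoding step scores k entries instead of V. Whether that yields a speedup comparable to the reported figure depends on model and hardware details not covered here.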