Rethinking the Reasonability of the Test Set for Simultaneous Machine Translation

Mengge Liu (Beijing Institute of Technology); Wen Zhang (Xiaomi AI Lab); Xiang Li (Xiaomi AI Lab); Jian Luan (Xiaomi AI Lab); Bin Wang (Xiaomi AI Lab); Yuhang Guo (Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Department of Computer Science and Technology, Beijing Institute of Technology); Shuoying Chen (Beijing Institute of Technology)

07 Jun 2023

Simultaneous machine translation (SimulMT) models start translating before the end of the source sentence, which makes the translation monotonically aligned with the source sentence. However, general full-sentence translation test sets are produced by offline translation of the entire source sentence and are not designed for SimulMT evaluation, which raises the question of whether they underestimate the performance of SimulMT models. In this paper, we manually annotate a monotonic test set based on the MuST-C English-Chinese test set. Our human evaluation confirms the acceptability of the annotated test set. Evaluations of three different SimulMT models verify that the underestimation problem is alleviated on our test set. Further experiments show that finetuning on an automatically extracted monotonic training set improves SimulMT models by up to 3 BLEU points.
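
The notion of a translation being "monotonically aligned" with its source can be made concrete with a small sketch. The abstract does not say how the monotonic training set was extracted, so everything here is an assumption for illustration only: word alignments are taken to be available as (source index, target index) pairs, and the names `monotonicity`, `filter_monotonic`, and the `threshold` value are hypothetical, not the authors' method.

```python
from itertools import combinations


def monotonicity(alignment):
    """Fraction of alignment-link pairs whose target order follows source order.

    `alignment` is a list of (source_index, target_index) pairs.
    This scoring rule is an illustrative assumption, not the paper's criterion.
    """
    links = sorted(set(alignment))
    # After sorting, the first link of each pair has the smaller source index;
    # pairs sharing a source index carry no ordering information and are skipped.
    pairs = [(a, b) for a, b in combinations(links, 2) if a[0] != b[0]]
    if not pairs:
        return 1.0
    in_order = sum(1 for a, b in pairs if a[1] <= b[1])
    return in_order / len(pairs)


def filter_monotonic(examples, threshold=0.9):
    """Keep examples whose alignment score reaches a hypothetical threshold."""
    return [ex for ex in examples if monotonicity(ex["alignment"]) >= threshold]
```

A score of 1.0 means the target words appear in the same order as the source words they align to, and a fully reversed alignment scores 0.0. A filter of this kind would keep only sentence pairs above the chosen threshold.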

Tags: Language resources and systems

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
