ZEPHYR: ZERO-SHOT PUNCTUATION RESTORATION
Minghan Wang (Huawei); Yinglu Li (Huawei); Jiaxin Guo (Huawei); Xiaosong Qiao (Huawei); Chang Su (Huawei); Min Zhang (Huawei); Shimin Tao (Huawei); Hao Yang (Huawei)
Punctuation restoration is crucial for cascaded speech translation systems. Traditional approaches typically treat it as a sequence tagging problem, predicting which punctuation mark should follow each word. However, this often demands significant computational and storage resources for full training or fine-tuning. We argue that pretrained language models (PLMs) can directly leverage their learned knowledge for punctuation generation, making additional training unnecessary. In this paper, we propose the Zephyr algorithm, which uses PLMs to perform zero-shot and few-shot punctuation restoration in both offline and streaming scenarios. Our experiments show that, compared with fine-tuning-based baselines, Zephyr achieves competitive performance while incurring little to no training cost and generalizing better in zero-shot and few-shot settings.
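To make the zero-shot idea concrete, the sketch below shows one way a PLM could restore punctuation without any training: after each word, check how much probability the model assigns to each punctuation mark as the next token, and attach the most likely mark if it clears a threshold. This is only an illustration of the general idea, not the authors' Zephyr algorithm; the GPT-2 model, the candidate set, and the probability threshold are all assumptions.

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Any causal PLM works in principle; GPT-2 is used here purely for illustration.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Candidate marks and the decision threshold are illustrative assumptions,
# not values taken from the paper.
MARKS = [",", ".", "?"]
THRESHOLD = 0.2

def restore_punctuation(words):
    out = []
    for word in words:
        out.append(word)
        ids = tokenizer(" ".join(out), return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits[0, -1]  # next-token distribution at the last position
        probs = torch.softmax(logits, dim=-1)
        # Probability the PLM assigns to each punctuation mark as the next token.
        scores = {m: probs[tokenizer.encode(m)[0]].item() for m in MARKS}
        mark, prob = max(scores.items(), key=lambda kv: kv[1])
        if prob > THRESHOLD:  # attach the mark only when the PLM is confident
            out[-1] = word + mark
    return " ".join(out)

print(restore_punctuation("hi how are you i am fine".split()))

Because the decision is made left to right, one word at a time, the same loop could in principle run incrementally over a streaming ASR transcript, which is one reason a causal PLM fits both the offline and streaming settings the paper targets.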