ZEPHYR: ZERO-SHOT PUNCTUATION RESTORATION

Minghan Wang (Huawei); Yinglu Li (Huawei); Jiaxin Guo (Huawei); Xiaosong Qiao (Huawei); Chang Su (Huawei); Min Zhang (Huawei); Shimin Tao (Huawei); Hao Yang (Huawei)

07 Jun 2023

Punctuation restoration is crucial for cascaded speech translation systems. Traditional approaches typically treat it as a sequence tagging problem, predicting which punctuation mark should follow each word. However, this usually requires significant computational and storage resources for full training or fine-tuning. We argue that pretrained language models (PLMs) can directly leverage their learned knowledge for punctuation generation, making additional training unnecessary. In this paper, we propose the Zephyr algorithm, which uses PLMs to perform zero-shot and few-shot punctuation restoration in both offline and streaming scenarios. Our experiments show that, compared with fine-tuning-based baselines, Zephyr achieves competitive performance while requiring little to no training cost and generalizes better in zero-shot and few-shot settings.
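The abstract does not spell out Zephyr's algorithm, but the core idea it describes — using a PLM's scores to choose punctuation without any task-specific training — can be sketched as follows. This is an illustrative reconstruction, not the paper's method: `toy_lm_score` is a hypothetical stand-in for a real PLM log-likelihood (e.g. from a causal language model), and the greedy left-to-right loop with one-word lookahead is an assumption.

```python
# Hedged sketch: zero-shot punctuation restoration by PLM scoring.
# After each word, try every candidate punctuation mark, score the
# resulting text with a language model, and keep the best candidate.
# `toy_lm_score` is a hypothetical stub standing in for a real PLM
# log-likelihood; a real system would query a pretrained model instead.

CANDIDATES = ["", ",", ".", "?"]  # "" means "no punctuation here"


def toy_lm_score(text: str) -> float:
    """Stand-in scorer. A real implementation would return a PLM
    log-likelihood of `text`. This toy rewards a period placed right
    before a capitalized word and mildly prefers no punctuation
    elsewhere, which is enough to demonstrate the selection loop."""
    score = 0.0
    tokens = text.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        if prev.endswith(".") and nxt[:1].isupper():
            score += 2.0  # likely sentence boundary before a capital
        elif prev[-1:] not in ",.?":
            score += 1.0  # mild preference for unpunctuated words
    return score


def restore_punctuation(words: list[str]) -> str:
    """Greedy left-to-right restoration: at each word, pick the
    candidate punctuation that maximizes the score of the text so far
    plus a one-word lookahead (needed to spot sentence boundaries)."""
    out: list[str] = []
    for i, word in enumerate(words):
        lookahead = words[i + 1] if i + 1 < len(words) else ""
        best = max(
            CANDIDATES,
            key=lambda p: toy_lm_score(
                " ".join(out + [word + p, lookahead]).strip()
            ),
        )
        out.append(word + best)
    return " ".join(out)


print(restore_punctuation(["hello", "world", "This", "is", "a", "test"]))
```

Because decisions are made left to right with only a short lookahead, the same loop applies to the streaming scenario the abstract mentions: each incoming word can be punctuated as soon as one following word is available.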
