AdapITN: A Fast, Reliable, and Dynamic Adaptive Inverse Text Normalization

Binh Thai Nguyen (Karlsruhe Institute of Technology); Duc Minh Nhat Le (Vietnam Artificial Intelligence Solutions); Quang Minh Nguyen (Vietnam Artificial Intelligence Solutions); Quoc Truong Do (Vietnam Artificial Intelligence Solutions); Chi-Mai Luong (ICTLab, University of Science and Technology of Hanoi, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet, Cau Giay, Hanoi, Vietnam); Alexander Waibel (Karlsruhe Institute of Technology)

06 Jun 2023

Inverse text normalization (ITN) is the task of transforming spoken-form text into written-form text. While automatic speech recognition (ASR) produces text in spoken form, humans and natural language understanding systems prefer to consume text in written form. ITN generally deals with semiotic phrases (e.g., numbers, dates, times). However, few studies address phonetization phrases, which are what ASR outputs when it handles unseen data (e.g., foreign named entities, domain names), even though these phrases appear in the spoken-form text alongside semiotic phrases. The reason is that phonetization phrases follow an unbounded set of patterns and are language-dependent. In this study, we introduce AdapITN, a novel end-to-end model that can handle both semiotic phrases (SEP) and phonetization phrases (PHP). We call it "Adap" because it allows for handling unseen PHP. The model runs only where necessary: it provides a mechanism that narrows the regions to be normalized and queries external knowledge, which reduces the runtime significantly.
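To make the task concrete, the following is a minimal rule-based sketch of the two ideas named in the abstract: normalizing only the regions that need it, and consulting external knowledge for phonetization phrases. It is not the AdapITN model, which is a learned end-to-end system. The function `normalize`, the word tables, and the lexicon entry are all hypothetical, and the sketch assumes lowercase, whitespace-tokenized ASR output.

```python
# Toy illustration only; NOT the AdapITN model. All names and tables here
# are assumptions made for the example.

ONES = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
        "six": 6, "seven": 7, "eight": 8, "nine": 9}
OTHER = {"zero": 0, "ten": 10, "eleven": 11, "twelve": 12, "thirteen": 13,
         "fourteen": 14, "fifteen": 15, "sixteen": 16, "seventeen": 17,
         "eighteen": 18, "nineteen": 19}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}


def normalize(text, lexicon):
    """Rewrite spoken-form text to written form.

    `lexicon` plays the role of external knowledge: it maps a tuple of
    phonetized tokens to a written form. Tokens that match neither the
    lexicon nor a number word are copied through untouched, so work is
    done only on the regions that need normalizing.
    """
    tokens = text.split()
    max_len = max((len(key) for key in lexicon), default=0)
    out = []
    i = 0
    while i < len(tokens):
        # 1. Phonetization phrases: longest lexicon match starting at i.
        matched = False
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in lexicon:
                out.append(lexicon[key])
                i += n
                matched = True
                break
        if matched:
            continue
        # 2. Semiotic phrases: numbers from 0 to 99 only.
        tok = tokens[i]
        if tok in TENS:
            if i + 1 < len(tokens) and tokens[i + 1] in ONES:
                out.append(str(TENS[tok] + ONES[tokens[i + 1]]))
                i += 2
            else:
                out.append(str(TENS[tok]))
                i += 1
        elif tok in ONES:
            out.append(str(ONES[tok]))
            i += 1
        elif tok in OTHER:
            out.append(str(OTHER[tok]))
            i += 1
        else:
            # 3. Everything else is left as-is.
            out.append(tok)
            i += 1
    return " ".join(out)


# Hypothetical lexicon entry for a phonetized domain name.
lexicon = {("goo", "gle", "dot", "com"): "google.com"}

print(normalize("meet me at gate twenty three", lexicon))
print(normalize("visit goo gle dot com today", lexicon))
```

In this sketch, adding a new entry to `lexicon` at runtime is enough to cover a previously unseen phonetization phrase without changing the code, which loosely mirrors the "adaptive" behaviour the abstract describes. A hand-written table like this covers only the cases it lists; the paper's point is that PHP patterns are too open-ended for such rules alone.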
