Spelling-Aware Word-Based End-to-End ASR

Ekaterina Egorova (Brno University of Technology); Hari Krishna Vydana (CERENCE INC.); Lukáš Burget (Brno University of Technology); Jan Honza Cernocky (Brno University of Technology)

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
09 Jun 2023

We propose a new end-to-end architecture for automatic speech recognition that expands the “listen, attend and spell” (LAS) paradigm. While the main network is trained to predict words, a secondary speller network is optimized to predict word spellings from the inner representations of the main network (e.g. word embeddings or context vectors from the attention module). We show that this joint training improves the word error rate of a word-based system and enables solving additional tasks, such as out-of-vocabulary word detection and recovery. The experiments are conducted on the LibriSpeech dataset, which consists of 1000 h of read speech.
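
The joint training described in the abstract can be pictured as a single combined objective. The following is a minimal sketch, not the authors' implementation: it assumes both networks are trained with cross-entropy losses that are combined by a weighted sum, which the abstract does not state. The names `joint_loss` and `spell_weight`, and the default weight of 0.5, are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the target class."""
    return -math.log(softmax(logits)[target])


def joint_loss(word_logits, word_target, char_logits_seq, char_targets,
               spell_weight=0.5):
    """Hypothetical joint objective: word loss plus a weighted spelling loss.

    word_logits     -- scores over the word vocabulary for one decoding step
    word_target     -- index of the reference word
    char_logits_seq -- one list of character scores per letter of the word
    char_targets    -- indices of the reference characters
    spell_weight    -- assumed interpolation weight (not taken from the paper)
    """
    if not char_targets or len(char_logits_seq) != len(char_targets):
        raise ValueError("need one non-empty score list per target character")
    word_loss = cross_entropy(word_logits, word_target)
    # Average the per-character losses so long words do not dominate.
    spell_loss = sum(
        cross_entropy(logits, target)
        for logits, target in zip(char_logits_seq, char_targets)
    ) / len(char_targets)
    return word_loss + spell_weight * spell_loss


# Toy usage: a 4-word vocabulary and a 2-letter word over a 3-character alphabet.
loss = joint_loss([0.0, 0.0, 0.0, 0.0], 2, [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], [0, 1])
```

In a sketch like this, the speller would receive its inputs from the main network (for example, a word embedding or an attention context vector), so gradients from the spelling loss would also update the main network. That shared gradient path is one plausible reading of how joint training could influence word predictions.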
