Hybrid Autoregressive Transducer (HAT)

Ehsan Variani, David Rybach, Cyril Allauzen, Michael Riley

SPS

Length: 13:07

04 May 2020

This paper proposes and evaluates the hybrid autoregressive transducer (HAT) model, a time-synchronous encoder-decoder model that preserves the modularity of conventional automatic speech recognition systems. The HAT model provides a way to measure the quality of its internal language model, which can be used to decide whether inference with an external language model is beneficial. The paper also presents a finite-context version of the HAT model that addresses the exposure bias problem and significantly simplifies overall training and inference. We evaluate the proposed model on a large-scale voice search task. Our experiments show significant improvements in word error rate (WER) compared to state-of-the-art approaches.
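
The abstract does not spell out the formulation, so the following is only a minimal sketch of one way the ideas could fit together. It assumes the blank symbol gets its own Bernoulli (sigmoid) probability while the labels share the remaining mass through a softmax, and that an external language model is combined log-linearly after discounting an internal language model score. The function names and the weights `lam_internal` and `lam_external` are hypothetical and chosen for illustration.

```python
import math


def hat_local_posterior(blank_logit, label_logits):
    """Return (p_blank, p_labels) for one (time, label-position) step.

    Assumption: blank is modeled with a sigmoid, and the labels are a
    softmax scaled by the remaining probability mass (1 - p_blank).
    """
    p_blank = 1.0 / (1.0 + math.exp(-blank_logit))
    # Numerically stable softmax over the label logits.
    max_logit = max(label_logits)
    exps = [math.exp(z - max_logit) for z in label_logits]
    total = sum(exps)
    p_labels = [(1.0 - p_blank) * e / total for e in exps]
    return p_blank, p_labels


def fused_score(log_p_model, log_p_internal_lm, log_p_external_lm,
                lam_internal=1.0, lam_external=1.0):
    """Log-linear combination of hypothesis scores (hypothetical weights).

    The internal language model score is subtracted before the external
    language model score is added.
    """
    return (log_p_model
            - lam_internal * log_p_internal_lm
            + lam_external * log_p_external_lm)


if __name__ == "__main__":
    p_blank, p_labels = hat_local_posterior(0.0, [1.0, 2.0, 3.0])
    print(p_blank, sum(p_labels))
    print(fused_score(-10.0, -4.0, -3.0))
```

Keeping the blank probability separate from the label softmax means the label distribution can be inspected on its own, which is what would make an internal language model score available for a comparison of this kind.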

Tags:

sps conference

icassp 2020 virtual conference

May 2020

icassp 2020
