Gated contextual adapters for selective contextual biasing in neural transducers

Anastasios Alexandridis (Amazon.com); Kanthashree Mysore Sathyendra (Amazon); Grant Strimel (Amazon.com); Feng-Ju Chang (Amazon); Ariya Rastrow (Amazon Alexa); Nathan Susanj (Amazon.com); Athanasios Mouchtaris (Amazon Alexa)

07 Jun 2023

Neural contextual biasing for end-to-end neural ASR transducers has shown significant improvements in the recognition of named entities, such as contact names or device names. However, it comes at the cost of increased compute, as the biasing layers (which are usually based on cross-attention) add complexity to the neural transducers. In this paper, we propose gated contextual biasing models that estimate at runtime when contextual biasing is needed and toggle it on or off accordingly. That way, contextual biasing does not run on every audio frame, but only on the frames where it can help recognition. We show that our gated contextual biasing models maintain all the performance improvements of contextual biasing while offering significant compute-cost savings, as contextual biasing needs to be executed for fewer than 15% of the audio frames.
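The gating idea can be illustrated with a small NumPy sketch. This is not the paper's architecture, which the abstract does not specify. It assumes a hypothetical per-frame logistic gate (`w_gate`, `b_gate`), a fixed `threshold`, and a single-head dot-product cross-attention over a catalog of entity embeddings whose output is added back to the frame. All names and shapes are illustrative assumptions. The point it shows is that the attention computation runs only on frames where the gate fires.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def gated_biasing(frames, catalog, w_gate, b_gate, threshold=0.5):
    """Apply cross-attention biasing only to frames selected by a gate.

    frames:  (T, d) encoder outputs, one row per audio frame.
    catalog: (N, d) embeddings of the biasing entities (e.g. contact names).
    w_gate, b_gate: parameters of a hypothetical logistic gate (assumed here).
    Returns the (T, d) output frames and the (T,) boolean gate mask.
    """
    # Assumed gate: a cheap per-frame score squashed to (0, 1).
    gate_prob = 1.0 / (1.0 + np.exp(-(frames @ w_gate + b_gate)))
    gate_on = gate_prob > threshold

    out = frames.copy()
    if gate_on.any():
        # The cross-attention is evaluated only for the gated-on frames.
        queries = frames[gate_on]                                   # (K, d)
        scores = queries @ catalog.T / np.sqrt(frames.shape[1])     # (K, N)
        attn = softmax(scores, axis=-1)                             # (K, N)
        out[gate_on] = queries + attn @ catalog                     # residual add
    return out, gate_on


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.normal(size=(100, 8))
    catalog = rng.normal(size=(5, 8))
    w_gate = rng.normal(size=8)
    # A negative bias keeps the gate closed on most frames.
    out, gate_on = gated_biasing(frames, catalog, w_gate, b_gate=-3.0)
    print("fraction of frames biased:", gate_on.mean())
```

In this sketch the gate costs one dot product per frame, while the attention costs grow with the catalog size, so skipping the attention on gated-off frames is where any saving would come from. A trained gate and the actual biasing layers would replace the random parameters used here.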

Tags:

Resource constrained speech recognition

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle