EFFECTIVE TRAINING OF RNN TRANSDUCER MODELS ON DIVERSE SOURCES OF SPEECH AND TEXT DATA

Takashi Fukuda (IBM Research); Samuel Thomas (IBM Research AI)

06 Jun 2023

This paper proposes a novel modeling framework for effective training of end-to-end automatic speech recognition (ASR) models on data from diverse domains and sources: speech paired with clean ground-truth transcripts, speech with noisy pseudo-transcripts from semi-supervised decodes, and unpaired text-only data. In our proposed approach, we build a recurrent neural network transducer (RNN-T) model with a shared multimodal encoder, multi-branch prediction networks, and a shared joint network. To train on unpaired text-only data sets along with transcribed speech data, the shared encoder is trained to process both speech and text modalities. Differences between data from multiple domains are handled by training a multi-branch prediction network on the different data sets; an interpolation step then combines the branches back into a single, computationally efficient branch. We show the benefit of the proposed technique on several ASR test sets by comparing our models to those trained by simple data mixing. The technique provides a significant relative improvement of up to 6% over baseline systems operating at a similar decoding cost.
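The interpolation step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how the branches are combined, so this assumes the simplest reading: a weighted linear average of the parameters of identically shaped prediction-network branches. The function name `interpolate_branches`, the parameter names, the weights, and all numbers are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence

# Hypothetical representation: parameter name -> flattened list of weights.
Params = Dict[str, List[float]]


def interpolate_branches(branches: Sequence[Params], weights: Sequence[float]) -> Params:
    """Linearly interpolate identically shaped parameter sets into one set.

    Assumption: interpolation is a weighted average in parameter space.
    The paper's actual procedure may differ.
    """
    if not branches:
        raise ValueError("need at least one branch")
    if len(branches) != len(weights):
        raise ValueError("one weight per branch is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]  # normalize so the weights sum to 1

    merged: Params = {}
    for name, ref in branches[0].items():
        # Every branch must hold the same parameter with the same size.
        for b in branches:
            if name not in b or len(b[name]) != len(ref):
                raise ValueError(f"branch shapes differ for parameter {name!r}")
        merged[name] = [
            sum(w * b[name][i] for w, b in zip(norm, branches))
            for i in range(len(ref))
        ]
    return merged


# Toy branches, one per data source named in the abstract (made-up values).
clean_branch = {"embed": [1.0, 2.0], "out": [0.0]}
pseudo_branch = {"embed": [3.0, 4.0], "out": [2.0]}
text_branch = {"embed": [5.0, 6.0], "out": [4.0]}

single = interpolate_branches(
    [clean_branch, pseudo_branch, text_branch], [2.0, 1.0, 1.0]
)
print(single)
```

Under this assumption the merged branch has the same size as any one of the original branches, which is consistent with the abstract's claim of a decoding cost similar to the baseline.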

Tags:

Acoustic modeling for automatic speech recognition

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
