DATA2VEC-AQC: SEARCH FOR THE RIGHT TEACHING ASSISTANT IN THE TEACHER-STUDENT TRAINING SETUP

Vasista Sai Lodagala (Indian Institute of Technology, Madras); Sreyan Ghosh (University of Maryland, College Park); S Umesh (IIT Chennai)

DOI

SPS

Members: Free
IEEE Members: $11.00
Non-members: $15.00

09 Jun 2023

In this paper, we propose a new Self-Supervised Learning (SSL) algorithm called data2vec-aqc, for speech representation learning from unlabeled speech data. Our goal is to improve SSL for speech in domains where both unlabeled and labeled data are limited. Building on the recently introduced data2vec, we introduce additional modules to the data2vec framework that leverage the benefit of data augmentations, quantized representations, and clustering. The interaction between these modules helps solve the cross-contrastive loss as an additional self-supervised objective. data2vec-aqc achieves up to 14.1% and 20.9% relative WER improvement over the existing state-of-the-art data2vec system on the test-clean and test-other sets, respectively, of LibriSpeech, without the use of any language model. Our proposed model also achieves up to 17.8% relative WER improvement over the baseline data2vec when fine-tuned on Switchboard data.

Tags:

Robust speech recognition and adaptation

DATA2VEC-AQC: SEARCH FOR THE RIGHT TEACHING ASSISTANT IN THE TEACHER-STUDENT TRAINING SETUP

Vasista Sai Lodagala (Indian Institute of Technology, Madras); Sreyan Ghosh (University of Maryland, College Park); S Umesh (IIT Chennai)

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle

More Like This

BENCHMARK OF PHYSIOLOGICAL MODEL BASED AND DEEP LEARNING BASED REMOTE PHOTOPLETHYSMOGRAPHY IN AUTOMOTIVE

FAST AND PARALLEL DECODING FOR TRANSDUCER

Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels

Join the IEEE Signal Processing Society