DOMAIN AND LANGUAGE ADAPTATION USING HETEROGENEOUS DATASETS FOR WAV2VEC2.0-BASED SPEECH RECOGNITION OF LOW-RESOURCE LANGUAGE

Kak Soky (Kyoto University); Sheng Li (National Institute of Information & Communications Technology (NICT)); Chenhui Chu (Kyoto University); Tatsuya Kawahara (Kyoto University)

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
07 Jun 2023

We address the effective fine-tuning of a large-scale pre-trained model for automatic speech recognition (ASR) of low-resource languages with only a one-hour matched dataset. The fine-tuning is composed of domain adaptation and language adaptation, which are conducted using heterogeneous datasets matched with either the domain or the language. For effective adaptation, we incorporate auxiliary tasks of domain identification and language identification via multi-task learning. Moreover, the embeddings from the auxiliary tasks are fused into the encoder output of the pre-trained model for ASR. Experimental evaluations on Khmer ASR using the corpus of the ECCC (the Extraordinary Chambers in the Courts of Cambodia) demonstrate that first conducting domain adaptation and then language adaptation is effective. In addition, multi-tasking with domain embedding gives the best performance, reducing the baseline CER by 6.47%.
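The abstract describes two mechanisms: a multi-task objective combining the ASR loss with auxiliary domain- and language-identification losses, and a fusion of the auxiliary-task embedding into the encoder output. The sketch below illustrates both in pure Python under stated assumptions; the weighting scheme (`lambda_dom`, `lambda_lang`) and concatenation-style fusion are illustrative choices, not the authors' actual implementation.

```python
def multitask_loss(asr_loss, domain_loss, language_loss,
                   lambda_dom=0.1, lambda_lang=0.1):
    """Combine the main ASR loss with the two auxiliary identification
    losses via fixed interpolation weights (assumed scheme)."""
    return asr_loss + lambda_dom * domain_loss + lambda_lang * language_loss

def fuse(encoder_frame, aux_embedding):
    """Fuse an auxiliary-task embedding into a single encoder output
    frame by concatenation, one plausible fusion strategy; the fused
    frame is then consumed by the ASR decoder/output layer."""
    return encoder_frame + aux_embedding  # list concatenation

# Example: a 4-dim encoder frame fused with a 2-dim domain embedding.
frame = [0.1, 0.2, 0.3, 0.4]
domain_embedding = [0.7, -0.7]
fused_frame = fuse(frame, domain_embedding)   # 6-dim fused representation
total = multitask_loss(asr_loss=2.0, domain_loss=0.5, language_loss=0.25)
```

In a real wav2vec2.0 fine-tuning setup these operations would act on tensors per time step, with the auxiliary embedding broadcast across frames; the sketch only fixes the data flow.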
