MINIMUM WORD ERROR TRAINING FOR NON-AUTOREGRESSIVE TRANSFORMER-BASED CODE-SWITCHING ASR

Yizhou Peng, Jicheng Zhang, Hao Huang, Haihua Xu, Eng Siong Chng

11 May 2022

The non-autoregressive end-to-end ASR framework is potentially well suited to the code-switching recognition task thanks to its inherent property that the current output token is independent of the preceding ones. However, it still under-performs state-of-the-art autoregressive ASR frameworks. In this paper, we propose several approaches to boosting the performance of a CTC-mask-based non-autoregressive Transformer in the code-switching ASR scenario. To begin with, we attempt diversified masking methods that are closely related to code-switching points, yielding an improved baseline model. More importantly, we employ the Minimum Word Error (MWE) criterion to train the model. One of the challenges is how to generate a diversified hypothesis space so as to obtain the average loss for a given ground truth. To address this challenge, we explore different approaches to producing the desired N-best-based hypothesis space. We demonstrate the efficacy of the proposed methods on the SEAME corpus, a challenging English-Mandarin code-switching corpus from the Southeast Asian community. Compared with a strong cross-entropy-trained baseline, the proposed MWE training method achieves consistent performance improvements on the test sets.
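The abstract does not spell out the masking scheme, but one plausible reading of "masking tied to code-switching points" is to mask tokens near language boundaries more aggressively when training the Mask-CTC-style conditional decoder. The sketch below is a minimal, hypothetical illustration of that idea; is_mandarin, p_base, and p_switch are assumptions for this sketch, not values from the paper.

```python
import random

MASK = "<mask>"

def is_mandarin(token: str) -> bool:
    """Crude language tag: any CJK character marks the token as Mandarin."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in token)

def cs_aware_mask(tokens, p_base=0.15, p_switch=0.5):
    """Mask tokens for conditional-prediction training, boosting the masking
    probability at code-switching points (hypothetical rates p_base/p_switch;
    the paper's exact scheme is not reproduced here)."""
    langs = [is_mandarin(t) for t in tokens]
    masked = []
    for i, tok in enumerate(tokens):
        # A token sits at a switch point if its language differs from a neighbor's.
        at_switch = (i > 0 and langs[i] != langs[i - 1]) or \
                    (i + 1 < len(tokens) and langs[i] != langs[i + 1])
        p = p_switch if at_switch else p_base
        masked.append(MASK if random.random() < p else tok)
    return masked
```

Likewise, a minimal sketch of the MWE objective the abstract describes: the expected word-error count over an N-best hypothesis space, with hypothesis posteriors renormalized over the list. Model-side details (how the CTC-mask model scores hypotheses, gradient flow) are omitted; mwe_loss and its interface are illustrative assumptions.

```python
import math
from typing import List, Tuple

def edit_distance(hyp: List[str], ref: List[str]) -> int:
    """Word-level Levenshtein distance, computed with a single-row DP table."""
    d = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        prev, d[0] = d[0], i
        for j, r in enumerate(ref, 1):
            prev, d[j] = d[j], min(d[j] + 1,         # deletion
                                   d[j - 1] + 1,     # insertion
                                   prev + (h != r))  # substitution
    return d[len(ref)]

def mwe_loss(nbest: List[Tuple[List[str], float]], ref: List[str]) -> float:
    """Expected word-error count over an N-best list.

    `nbest` holds (hypothesis, log_prob) pairs; posteriors are renormalized
    over the list with a softmax so the weights sum to one.
    """
    log_probs = [lp for _, lp in nbest]
    m = max(log_probs)
    weights = [math.exp(lp - m) for lp in log_probs]
    z = sum(weights)
    return sum(w / z * edit_distance(hyp, ref)
               for (hyp, _), w in zip(nbest, weights))

# Example: two hypotheses for the reference "我 要 go home".
ref = "我 要 go home".split()
nbest = [("我 要 go home".split(), -1.2),
         ("我 要 去 home".split(), -2.3)]
print(mwe_loss(nbest, ref))  # small expected error, weighted toward the better hypothesis
```

Minimizing this expectation pushes probability mass toward low-error hypotheses, which is why the diversity of the N-best space matters: a degenerate list gives a near-constant loss and little training signal.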
