Decoupling Pronunciation And Language For End-To-End Code-Switching Automatic Speech Recognition

Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Ye Bai, Jianhua Tao, Zhengqi Wen

Length: 00:09:03

09 Jun 2021

Despite recent significant advances in end-to-end (E2E) ASR systems for code-switching, their heavy reliance on audio-text paired data limits further improvement in model performance. In this paper, we propose a decoupled transformer model that uses monolingual paired data and unpaired text data to alleviate the shortage of code-switching data. The model is decoupled into two parts: an audio-to-phoneme (A2P) network and a phoneme-to-text (P2T) network. The A2P network learns acoustic patterns from large-scale monolingual paired data. During training, it also generates multiple phoneme sequence candidates for each audio utterance in real time. The generated phoneme-text paired data is then used to train the P2T network, which can be pre-trained with large amounts of external unpaired text data. By using monolingual data and unpaired text data, the decoupled transformer model reduces, to a certain extent, the heavy dependency of E2E models on code-switching paired training data. Finally, the two networks are optimized jointly through attention fusion. We evaluate the proposed method on a public Mandarin-English code-switching dataset. Compared with our transformer baseline, the proposed method achieves an 18.14% relative reduction in mix error rate.
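The abstract reports its result as a relative reduction in mix error rate but does not define the metric. The sketch below assumes a common convention for Mandarin-English data: each Mandarin character and each English word counts as one token, and the error rate is the token-level edit distance divided by the number of reference tokens. The function names, the tokenization regex, and the numbers in the usage lines are illustrative assumptions, not taken from the paper.

```python
import re

# Assumption: one token per CJK character, one token per run of Latin
# letters or apostrophes. Punctuation and digits are ignored in this sketch.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z']+")


def mixed_tokens(text):
    """Split a code-switched string into Mandarin characters and English words."""
    return [t.lower() for t in _TOKEN.findall(text)]


def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]


def mix_error_rate(refs, hyps):
    """Total token errors divided by total reference tokens."""
    errors = 0
    total = 0
    for ref, hyp in zip(refs, hyps):
        ref_tokens = mixed_tokens(ref)
        errors += edit_distance(ref_tokens, mixed_tokens(hyp))
        total += len(ref_tokens)
    return errors / total


def relative_reduction(baseline, proposed):
    """Relative error reduction of `proposed` with respect to `baseline`."""
    return (baseline - proposed) / baseline


# Hypothetical sentences and error rates, for illustration only.
mer = mix_error_rate(["我想要 play music"], ["我要 play the music"])
gain = relative_reduction(10.0, 8.0)
```

In the usage lines, the reference has five tokens and the hypothesis differs by one deleted character and one inserted word, so `mer` is 2/5. A relative reduction is always measured against the baseline's error rate, so `gain` here is 0.2, meaning a 20% relative reduction for these made-up values.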

Chairs:

Karen Livescu

Tags:

signal processing society

IEEE ICASSP 2021

virtual conference

June 6-11, 2021
