SED-MDD: Towards Sentence Dependent End-to-End Mispronunciation Detection and Diagnosis

Yiqing Feng, Guanyu Fu, Qingcai Chen, Kai Chen

Length: 17:35

04 May 2020

A mispronunciation detection and diagnosis (MD&D) system typically consists of multiple stages, such as an acoustic model, a language model, and a Viterbi decoder. To integrate these stages, we propose SED-MDD, an end-to-end model for sentence-dependent MD&D. The proposed model takes mel-spectrograms and characters as inputs and outputs the corresponding phone sequence. Our experiments show that SED-MDD can implicitly learn the phonological rules in both acoustic and linguistic features directly from the phonological annotation and transcription in the training data. To the best of our knowledge, SED-MDD is the first model of its kind. It achieves an accuracy of 86.35% and a correctness of 88.61% on L2-ARCTIC, significantly outperforming the existing end-to-end MD&D model CNN-RNN-CTC.
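The abstract reports accuracy and correctness without defining them. As a minimal sketch, assuming the conventional HTK-style definitions for phone recognition (correctness = (N - S - D) / N and accuracy = (N - S - D - I) / N, where N is the number of reference phones and S, D, I are substitutions, deletions, and insertions), the two scores could be computed as follows. The function names and the example phone sequences are hypothetical and are not taken from the paper.

```python
def align_counts(ref, hyp):
    """Count (substitutions, deletions, insertions) along one minimum-edit-distance alignment."""
    n, m = len(ref), len(hyp)
    # table[i][j] = (total_edits, S, D, I) for aligning ref[:i] with hyp[:j]
    table = [[None] * (m + 1) for _ in range(n + 1)]
    table[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        table[i][0] = (i, 0, i, 0)      # only deletions
    for j in range(1, m + 1):
        table[0][j] = (j, 0, 0, j)      # only insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, ins = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (t, s, d, ins)               # match, no cost
            else:
                diag = (t + 1, s + 1, d, ins)       # substitution
            t, s, d, ins = table[i - 1][j]
            deletion = (t + 1, s, d + 1, ins)
            t, s, d, ins = table[i][j - 1]
            insertion = (t + 1, s, d, ins + 1)
            # Keep the candidate with the fewest total edits.
            table[i][j] = min(diag, deletion, insertion, key=lambda c: c[0])
    _, s, d, ins = table[n][m]
    return s, d, ins


def correctness_and_accuracy(ref, hyp):
    """Assumed HTK-style scores; accuracy additionally penalizes insertions."""
    s, d, ins = align_counts(ref, hyp)
    n = len(ref)
    correctness = (n - s - d) / n
    accuracy = (n - s - d - ins) / n
    return correctness, accuracy


# Hypothetical example: one substituted phone and one inserted phone.
reference = ["DH", "IH", "S", "IH", "Z"]
recognized = ["D", "IH", "S", "IH", "Z", "AH"]
corr, acc = correctness_and_accuracy(reference, recognized)
print(f"correctness={corr:.2f}, accuracy={acc:.2f}")
```

Under these assumed definitions, accuracy can never exceed correctness, which is consistent with the reported 86.35% accuracy and 88.61% correctness.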

Tags:

sps conference

icassp 2020 virtual conference

May 2020

icassp 2020
