A Multi-View Approach For Mandarin Non-Native Mispronunciation Verification

Zhenyu Wang, John H.L. Hansen, Yanlu Xie

Length: 11:29

04 May 2020

Traditionally, the performance of non-native mispronunciation verification systems has relied on effective phone-level labelling of non-native corpora. In this study, a multi-view approach is proposed that incorporates discriminative feature representations and requires less annotation for non-native mispronunciation verification of Mandarin. Models are jointly learned to embed acoustic sequences and multi-source information for speech attributes and bottleneck features. Bidirectional LSTM embedding models with contrastive losses map the acoustic sequences and multi-source information into fixed-dimensional embeddings. The distance between acoustic embeddings is taken as a measure of the similarity between phones. Accordingly, mispronounced phones are expected to have a low similarity score with their canonical pronunciations. In diagnostic accuracy on a mispronunciation verification task, the approach improves on a goodness-of-pronunciation (GOP) based approach by 11.23% and on a single-view approach by 1.47%.
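The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact form of the contrastive loss, the distance measure, or the decision threshold, so the cosine distance, the margin of 0.5, the threshold of 0.5, the toy 3-dimensional embeddings, and all function names below are assumptions chosen for illustration. In the described system the embeddings would come from trained bidirectional LSTM models rather than being written by hand.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor, matched, mismatched, margin=0.5):
    """Margin-based contrastive loss (assumed form, not taken from the paper).

    The loss is zero once the matched pair is closer to the anchor than the
    mismatched pair by at least `margin`, measured in cosine distance.
    """
    d_pos = 1.0 - cosine_similarity(anchor, matched)
    d_neg = 1.0 - cosine_similarity(anchor, mismatched)
    return max(0.0, margin + d_pos - d_neg)


def is_mispronounced(acoustic, canonical, threshold=0.5):
    """Flag a phone whose similarity to its canonical embedding is below threshold."""
    return cosine_similarity(acoustic, canonical) < threshold


# Hypothetical toy embeddings standing in for model outputs.
canonical = [1.0, 0.0, 0.0]   # embedding of the canonical phone
good = [0.9, 0.1, 0.0]        # close to canonical: accepted
bad = [0.0, 1.0, 0.0]         # orthogonal to canonical: flagged

print(is_mispronounced(good, canonical))
print(is_mispronounced(bad, canonical))
```

In practice the threshold would be tuned on held-out data, and the margin is a training hyperparameter; both values here are placeholders.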

Tags: sps conference, icassp 2020 virtual conference, May 2020, icassp 2020
