  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 00:08:23
10 May 2022

In pronunciation assessment, the assessor's perception is influenced by a particular pronunciation template. An assessor may therefore hold a bias against certain variations in pronunciation that do not necessarily impair communication, yet are penalized during the assessment. This work proposes a model of pronunciation assessment as the combination of an assessor-independent component (A) and an assessor-specific component (B). The latter can be interpreted as the assessor bias. The resulting assessment function was implemented as a dual model trained to detect mispronounced speech segments. The models incorporate Long Short-Term Memory (LSTM) and salient region selection using attention. An experiment was performed using recordings from young Dutch learners of English as a second language, which were annotated for mispronunciation by three trained phoneticians (a1, a2, a3). Given the assessor identity, the combined models detected mispronunciations with F1 scores of 0.77, 0.68 and 0.86 for a1, a2 and a3 respectively on the Train set, and 0.66, 0.53 and 0.81 on the Test set. Additionally, the attention weights of the B model illustrated disagreements between assessors that are related to the bias.
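As a rough illustration of the decomposition described above, the sketch below combines an assessor-independent score with an assessor-specific offset and evaluates the result with F1 per assessor. The abstract does not state how A and B are combined, so the additive combination of logits here is an assumption, and every name and number (`combined_probability`, `shared_logits`, `bias_logits`, `labels`, the 0.5 threshold) is hypothetical toy data rather than anything from the paper.

```python
import math


def sigmoid(x):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def combined_probability(shared_logit, bias_logit):
    """Assumed additive combination of the A (shared) and B (assessor) scores."""
    return sigmoid(shared_logit + bias_logit)


def f1_score(y_true, y_pred):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy values: one shared (A) score per speech segment.
shared_logits = [2.0, -1.0, 0.3, -0.2]

# Hypothetical assessor-specific (B) offsets for the same four segments.
bias_logits = {
    "a1": [0.0, 0.0, -0.5, 0.5],
    "a2": [-2.5, 0.0, 0.0, 0.0],
}

# Hypothetical annotations: 1 = segment marked as mispronounced.
labels = {
    "a1": [1, 0, 0, 1],
    "a2": [0, 0, 1, 1],
}


def predict(assessor, threshold=0.5):
    """Predict mispronunciation labels for one assessor's point of view."""
    return [
        int(combined_probability(s, b) >= threshold)
        for s, b in zip(shared_logits, bias_logits[assessor])
    ]


for assessor in sorted(bias_logits):
    preds = predict(assessor)
    print(assessor, preds, round(f1_score(labels[assessor], preds), 2))
```

The shared scores are identical for both assessors, so any difference in the predictions comes only from the B offsets, which is the sense in which B can be read as a per-assessor bias.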
