SELF-SUPERVISED ACCENT LEARNING FOR UNDER-RESOURCED ACCENTS USING NATIVE LANGUAGE DATA

Mehul Kumar (Samsung Research); Jiyeon Kim (Samsung Research); Dhananjaya Gowda (Samsung Electronics); Abhinav Garg (Stanford); Chanwoo Kim (Samsung Electronics)

07 Jun 2023

In this paper, we propose a novel method to improve the accuracy of an English speech recognizer for a target accent using data from the corresponding native language. Collecting labeled data for every accent of English to train an end-to-end neural speech recognizer is difficult and expensive, and even finding a representative pool of English speakers for an arbitrary accent to collect unlabeled data can be hard. Collecting unlabeled speech in any native language, however, is a much simpler task. Crucially, the accents of most non-native English speakers are heavily shaped by the co-articulation of sounds in their native language. In view of this, we propose to use unlabeled native-language data to learn self-supervised representations during the pre-training stage; the pre-trained model is then fine-tuned on limited labeled English data for the target accent. Experiments in which an English recognizer is pre-trained on native-language data and then fine-tuned on target-accented English show significant word-error-rate improvements on four different accents (Great Britain, Korean, Chinese, Spanish).
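The two-stage recipe the abstract describes can be sketched in a few lines of PyTorch. This is a hypothetical illustration, not the authors' code: the encoder, the masked-frame reconstruction objective (standing in for whatever self-supervised loss the paper actually uses), and all dimensions are assumptions. Stage 1 pre-trains the encoder on unlabeled native-language speech; stage 2 discards the self-supervised head and fine-tunes with a CTC head on a small labeled accented-English set.

```python
# Hypothetical sketch of the pre-train / fine-tune pipeline described above.
# The masked-reconstruction loss is a stand-in for the paper's actual
# self-supervised objective; shapes and sizes are illustrative only.
import torch
import torch.nn as nn

torch.manual_seed(0)

class Encoder(nn.Module):
    """Shared acoustic encoder used in both training stages."""
    def __init__(self, feat_dim=40, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)

    def forward(self, x):           # x: (batch, time, feat_dim)
        out, _ = self.rnn(x)
        return out                  # (batch, time, hidden)

def pretrain_step(encoder, recon_head, feats, mask_prob=0.15):
    # Self-supervised: zero out random frames, predict the original features.
    mask = torch.rand(feats.shape[:2]) < mask_prob
    corrupted = feats.clone()
    corrupted[mask] = 0.0
    pred = recon_head(encoder(corrupted))
    return ((pred - feats)[mask] ** 2).mean()

def finetune_step(encoder, ctc_head, feats, targets, in_lens, tgt_lens):
    # Supervised: CTC loss on the limited labeled accented-English data.
    logp = ctc_head(encoder(feats)).log_softmax(-1).transpose(0, 1)  # (T, N, C)
    return nn.functional.ctc_loss(logp, targets, in_lens, tgt_lens, blank=0)

encoder = Encoder()
recon_head = nn.Linear(128, 40)   # pre-training head, discarded after stage 1
ctc_head = nn.Linear(128, 30)     # e.g. 29 output symbols + CTC blank

# Stage 1: unlabeled native-language speech (e.g. Korean, for Korean-accented English)
native_feats = torch.randn(8, 50, 40)
loss_pt = pretrain_step(encoder, recon_head, native_feats)

# Stage 2: limited labeled target-accented English
eng_feats = torch.randn(4, 50, 40)
targets = torch.randint(1, 30, (4, 10))
loss_ft = finetune_step(encoder, ctc_head, eng_feats, targets,
                        torch.full((4,), 50), torch.full((4,), 10))
```

In practice both losses would drive an optimizer over many batches; the point of the sketch is only that the encoder weights carry over from the unlabeled native-language stage into the labeled fine-tuning stage.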
