Length: 14:05
04 May 2020

The recent success of the Transformer-based sequence-to-sequence framework on various Natural Language Processing tasks has motivated its application to Automatic Speech Recognition. In this work, we explore the application of Transformers to low-resource Indian languages in a multilingual framework. We explore two places to incorporate language information into a multilingual Transformer: (i) at the decoder and (ii) at the encoder. The methods include using language identity tokens and providing language information to the acoustic vectors. Language information can be given to the acoustic vectors either as a one-hot vector or as a learned language embedding. In our experiments, providing language identity always improved performance. The language embedding learned with our proposed approach, when added to the acoustic feature vector, gave the best result. The proposed approach with retraining gave 6% to 11% relative improvement in character error rate over the monolingual baseline.
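
The three ways of supplying language identity described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The function names, the feature dimensions, the choice to concatenate the one-hot vector to each frame, and the choice to prepend the language token to the decoder input are all assumptions made for illustration. In a real system the embedding table would be learned jointly with the Transformer rather than fixed.

```python
def append_one_hot(frames, lang_id, num_langs):
    """Encoder side (assumed): concatenate a one-hot language vector to every acoustic frame."""
    one_hot = [1.0 if i == lang_id else 0.0 for i in range(num_langs)]
    return [frame + one_hot for frame in frames]


def add_language_embedding(frames, lang_id, embedding_table):
    """Encoder side: add a language embedding to every acoustic frame.

    embedding_table[lang_id] must have the same length as a frame.
    The table is a fixed placeholder here; in training it would be learned.
    """
    emb = embedding_table[lang_id]
    return [[x + e for x, e in zip(frame, emb)] for frame in frames]


def prepend_language_token(token_ids, lang_id, lang_token_offset):
    """Decoder side (assumed): put a language identity token before the output tokens.

    lang_token_offset is a hypothetical index where language tokens start
    in the output vocabulary.
    """
    return [lang_token_offset + lang_id] + token_ids


# Two hypothetical 2-dimensional acoustic frames and three languages.
frames = [[0.1, 0.2], [0.3, 0.4]]
table = [[0.5, -0.5], [1.0, 1.0], [0.0, 0.0]]

with_one_hot = append_one_hot(frames, lang_id=1, num_langs=3)
with_embedding = add_language_embedding(frames, lang_id=0, embedding_table=table)
decoder_input = prepend_language_token([5, 6], lang_id=2, lang_token_offset=100)
```

Under these assumptions the one-hot variant widens each frame by the number of languages, whereas the additive embedding keeps the feature dimension unchanged, so the rest of the encoder needs no change in input size.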
