Bangla Voice Command Recognition In End-To-End System Using Topic Modeling Based Contextual Rescoring
Sudipta Saha Shubha, Nafis Sadeq, Shafayat Ahmed, Md. Nahidul Islam, Muhammad Abdullah Adnan
SPS
Length: 15:20
In this work, we perform contextual rescoring using multi-label topic modeling to improve the performance of an End-to-End Bangla voice command recognition system. Our End-to-End architecture uses a hybrid of Connectionist Temporal Classification (CTC) and an attention mechanism. We use a Recurrent Neural Network (RNN) as the language model and Labeled Latent Dirichlet Allocation (Labeled LDA) for contextual rescoring. Our experiments show that our rescoring method reduces the Word Error Rate (WER) from 16.7% to 12.8% (about 23% relative) on a Bangla voice command recognition task when relevant context is provided. The system does not lose any performance when irrelevant context is provided.
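As a rough illustration of what topic-based contextual rescoring of an n-best list can look like, the sketch below adds a weighted topic-model score to each hypothesis's base score. It is not the authors' implementation. The interpolation formula, the weight `lam`, the `FLOOR` smoothing value, the English toy vocabulary, and the `TOPIC_WORD` table (standing in for a trained Labeled LDA model) are all assumptions made for this example.

```python
import math

# Hypothetical topic-word probabilities; a real system would take these
# from a trained Labeled LDA model over Bangla text.
TOPIC_WORD = {
    "music": {"play": 0.4, "song": 0.4, "call": 0.1, "sung": 0.1},
    "phone": {"play": 0.1, "song": 0.05, "call": 0.6, "sung": 0.25},
}
FLOOR = 1e-4  # assumed probability for words unseen under a topic


def context_log_score(words, topic_weights):
    """Sum of log P(word | context), mixing topics by their context weights."""
    total = 0.0
    for w in words:
        p = sum(weight * TOPIC_WORD[topic].get(w, FLOOR)
                for topic, weight in topic_weights.items())
        total += math.log(p)
    return total


def rescore(nbest, topic_weights, lam=0.5):
    """Return the hypothesis text with the best combined score.

    nbest: list of (text, base_log_score) pairs, where the base score is
    assumed to already combine the acoustic model and RNN language model.
    lam: assumed interpolation weight for the contextual score.
    """
    scored = [
        (base + lam * context_log_score(text.split(), topic_weights), text)
        for text, base in nbest
    ]
    return max(scored, key=lambda pair: pair[0])[1]


# Toy n-best list: the acoustically preferred hypothesis is a misrecognition.
nbest = [("play a sung", -4.0), ("play a song", -4.2)]
music_context = {"music": 0.9, "phone": 0.1}
best = rescore(nbest, music_context)
```

With `lam=0.0` the function falls back to the base ranking, so the weight controls how strongly the supplied context can reorder the hypotheses.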