  • SPS Members: Free
  • IEEE Members: $11.00
  • Non-members: $15.00
  • Length: 14:46
04 May 2020

Deep-learning-based approaches have greatly improved the performance of spoken keyword spotting (KWS). However, KWS systems for different languages need modeling units suited to each language to perform well. In this paper, we propose an end-to-end Mandarin KWS system that uses a Convolutional Recurrent Neural Network trained with the Connectionist Temporal Classification (CTC) loss function (CRNN-CTC). Tonal syllables are adopted as the modeling units. Experiments on the AISHELL-2 dataset show that the proposed approach achieves a false rejection rate of 5.35% at 0.26 false alarms (FA) per hour on the 13-keyword task and 6.37% at 0.17 FA/hour on the 20-keyword task.
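
To make the role of tonal-syllable units concrete, the following is a minimal Python sketch of how per-frame outputs from a CTC-trained model could be turned into a syllable sequence and checked for a keyword. It is an illustration only, not the decoding procedure used in the paper, which the abstract does not describe. The unit inventory, the blank index, the frame scores, and the function names `ctc_greedy_decode` and `contains_keyword` are all hypothetical.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      units: List[str]) -> List[str]:
    """Greedy CTC decoding: take the best unit per frame,
    merge consecutive repeats, then drop blanks."""
    decoded: List[str] = []
    previous = BLANK
    for scores in frame_scores:
        best = max(range(len(scores)), key=scores.__getitem__)
        if best != previous and best != BLANK:
            decoded.append(units[best])
        previous = best
    return decoded


def contains_keyword(decoded: List[str], keyword: List[str]) -> bool:
    """Return True if the keyword's syllables occur contiguously."""
    n = len(keyword)
    return any(decoded[i:i + n] == keyword
               for i in range(len(decoded) - n + 1))


# Hypothetical inventory: blank plus three tonal syllables (pinyin + tone digit).
units = ["<blank>", "ni3", "hao3", "ma5"]

# Made-up per-frame scores standing in for CRNN outputs (one row per frame).
frame_scores = [
    [0.90, 0.05, 0.03, 0.02],  # blank
    [0.10, 0.80, 0.05, 0.05],  # ni3
    [0.10, 0.70, 0.10, 0.10],  # ni3 (repeat, merged)
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.70, 0.10],  # hao3
    [0.20, 0.10, 0.60, 0.10],  # hao3 (repeat, merged)
]

syllables = ctc_greedy_decode(frame_scores, units)
print(syllables)                                    # ['ni3', 'hao3']
print(contains_keyword(syllables, ["ni3", "hao3"]))  # True
```

Because the tone is part of each unit, two keywords that differ only in tone map to different label sequences. A practical system would likely score keywords against posterior thresholds rather than rely on an exact match, since the threshold is what trades false rejections against false alarms.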
