CRNN-CTC Based Mandarin Keywords Spotting

Haikang Yan, Qianhua He, Wei Xie

SPS

Length: 14:46

04 May 2020

Deep-learning-based approaches have greatly improved the performance of spoken keyword spotting (KWS). However, KWS systems for different languages need modeling units suited to each language to achieve the best performance. In this paper, we propose an end-to-end Mandarin KWS system that uses a convolutional recurrent neural network trained with the connectionist temporal classification (CTC) loss function (CRNN-CTC). Tonal syllables are adopted as the modeling units. Experiments on the AISHELL-2 dataset show that, on the 13-keyword and 20-keyword tasks, the proposed approach achieves false rejection rates of 5.35% at 0.26 false alarms (FA) per hour and 6.37% at 0.17 FA per hour, respectively.
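To make the abstract's terms concrete, the following is a minimal Python sketch of how frame-level CTC outputs over tonal syllables could be turned into a keyword decision, and how the two reported metrics are computed. It is not the authors' implementation. The best-path (greedy) decoding rule, the contiguous-match detection rule, the pinyin-with-tone-digit labels such as "ni3", and all function names are assumptions made for illustration.

```python
from typing import List, Sequence

BLANK = "<blank>"  # CTC blank symbol; the name is an assumption for this sketch


def ctc_greedy_decode(frame_labels: Sequence[str], blank: str = BLANK) -> List[str]:
    """Collapse consecutive repeated labels, then drop blanks (CTC best-path rule)."""
    decoded: List[str] = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


def contains_keyword(syllables: Sequence[str], keyword: Sequence[str]) -> bool:
    """Return True if the keyword's tonal syllables occur contiguously in the sequence."""
    n = len(keyword)
    if n == 0:
        return False
    return any(
        list(syllables[i:i + n]) == list(keyword)
        for i in range(len(syllables) - n + 1)
    )


def false_rejection_rate(missed: int, total_keyword_utterances: int) -> float:
    """Percentage of keyword utterances that the system failed to detect."""
    return 100.0 * missed / total_keyword_utterances


def false_alarms_per_hour(false_alarms: int, audio_seconds: float) -> float:
    """Number of spurious detections per hour of non-keyword audio."""
    return false_alarms / (audio_seconds / 3600.0)


# Hypothetical per-frame argmax labels for an utterance of "ni3 hao3".
frames = [BLANK, "ni3", "ni3", BLANK, "hao3", "hao3", BLANK, BLANK]
decoded = ctc_greedy_decode(frames)
detected = contains_keyword(decoded, ["ni3", "hao3"])
```

Because the tone digit is part of each label, a sequence decoded as "ni2 hao3" would not match the keyword "ni3 hao3" under this rule, which illustrates why the choice of modeling unit matters for a tonal language.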

Tags: SPS conference, ICASSP 2020 virtual conference, May 2020, ICASSP 2020