Low-Latency Lightweight Streaming Speech Recognition With 8-Bit Quantized Simple Gated Convolutional Neural Networks

Jinhwan Park, Xue Qian, Youngmin Jo, Wonyong Sung

Length: 12:58

04 May 2020

Automatic speech recognition (ASR) is an important capability for mobile devices. However, deep neural network-based ASR demands a large number of computations, while the memory bandwidth and battery capacity of mobile devices are limited. Server-based implementations are therefore mostly employed, but they increase latency and raise privacy concerns. Efficient on-device ASR addresses both issues. In this paper, we propose a low-latency on-device speech recognition system based on a simple gated convolutional network (SGCN). The SGCN shows competitive recognition accuracy even with only 1M parameters. In addition, the SGCN lends itself to parallelization, which enables efficient cache utilization. 8-bit quantization is applied to reduce the memory size and computation time. The proposed system performs online recognition within a 0.4 s latency limit and operates at a real-time factor of 0.2 using only a single 900 MHz CPU core. The system occupies a 1.2 MB memory footprint and achieves a 19.75% word error rate (WER) with greedy decoding.
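The figures in the abstract can be related with a little arithmetic. The following is a minimal sketch, not the authors' implementation: it assumes symmetric per-tensor quantization (the abstract does not state which 8-bit scheme is used), and all function names are hypothetical. It shows why roughly 1M parameters at 8 bits come to about 1 MB of weights, which is in line with the reported 1.2 MB footprint, and how a real-time factor is computed.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization (an assumed scheme, for illustration).

    Maps floats to integers in [-127, 127] and returns them with the scale.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


def model_bytes(num_params, bits_per_param):
    """Storage needed for the weights alone, in bytes."""
    return num_params * bits_per_param // 8


def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]  # toy values, not from the paper
    q, scale = quantize_int8(weights)
    print(q, scale)
    print(dequantize(q, scale))
    print(model_bytes(1_000_000, 32), "bytes at 32-bit")
    print(model_bytes(1_000_000, 8), "bytes at 8-bit")
    print(real_time_factor(2.0, 10.0))
```

Under these assumptions, moving from 32-bit floats to 8-bit integers cuts weight storage by a factor of four, and a real-time factor of 0.2 means that 10 seconds of audio take about 2 seconds of processing.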

Tags:

sps conference

icassp 2020 virtual conference

May 2020

icassp 2020
