04 May 2020

The Softmax-based attention mechanism is often employed by End-to-End Automatic Speech Recognition (E2E ASR) models to tell the network where to focus within the input. However, because the output probabilities of the Softmax function are dense and nonnegative, the attention distribution becomes increasingly flat as the input sequence length grows, which makes it unable to highlight the important information in speech. In this paper, we present two sparse attention mechanisms for ASR tasks with long utterances, which improve the attention mechanism by introducing sparse transformations. First, we propose to replace the Softmax with the Sparsemax, which normalizes the attention weights by finding the closest point in the probability simplex. Then, exploiting the structured characteristic of speech that each pronunciation has a relatively stable duration, we further present a structured sparse transformation that forces the network to attend to a continuous segment of speech by applying an ℓ2 penalty. We also design a non-iterative solution algorithm for this transformation that can be used in backpropagation. Experiments show that our methods achieve better ASR results than a well-tuned attention-based baseline system on a character-level ASR task.
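
The Sparsemax mentioned in the abstract is the Euclidean projection of the attention scores onto the probability simplex, which assigns exactly zero weight to low-scoring positions instead of the small positive probabilities produced by Softmax. Below is a minimal NumPy sketch of that projection; the function name sparsemax and the example scores are illustrative and are not taken from the paper's implementation:

import numpy as np

def sparsemax(scores):
    # Sparsemax: Euclidean projection of the score vector onto the
    # probability simplex. Low-scoring positions receive exactly zero
    # weight, unlike Softmax. (Illustrative sketch, not the paper's code.)
    z = np.asarray(scores, dtype=np.float64)
    z_sorted = np.sort(z)[::-1]              # scores in descending order
    k = np.arange(1, z.size + 1)
    cumsum = np.cumsum(z_sorted)
    # support size: largest k such that 1 + k * z_(k) > sum of the top-k scores
    k_z = k[1 + k * z_sorted > cumsum][-1]
    tau = (cumsum[k_z - 1] - 1.0) / k_z      # threshold so the weights sum to 1
    return np.maximum(z - tau, 0.0)

# Example (hypothetical attention scores over four encoder frames):
# sparsemax yields [0.75, 0.25, 0.0, 0.0], whereas softmax would spread
# small but nonzero probability over every frame.
print(sparsemax([2.0, 1.5, 0.1, -1.0]))

The structured variant described in the abstract additionally encourages the nonzero weights to form a contiguous segment via the ℓ2 penalty; its non-iterative solution is specific to the paper and is not sketched here.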
