Weakly-Supervised Sound Event Detection With Self-Attention

Koichi Miyazaki, Tatsuya Komatsu, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Kazuya Takeda

04 May 2020

In this paper, we propose a novel sound event detection (SED) method that incorporates the self-attention mechanism of the Transformer for a weakly-supervised learning scenario. The proposed method utilizes the Transformer encoder, which consists of stacked self-attention modules, allowing the model to take both local and global context information of the input feature sequence into account. Furthermore, inspired by the great success of BERT in the natural language processing field, the proposed method introduces a special tag token into the input sequence for weak label prediction, which enables the aggregation of information over the whole sequence. To demonstrate the performance of the proposed method, we conduct an experimental evaluation using the DCASE2019 Task 4 dataset. The experimental results demonstrate that the proposed method outperforms the DCASE2019 Task 4 baseline, which is based on a convolutional recurrent neural network, and that the self-attention mechanism works effectively for SED.
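To make the described architecture concrete, below is a minimal PyTorch sketch, not the authors' implementation, of a Transformer-encoder SED model with a BERT-style tag token: a learnable token is prepended to the frame sequence, its encoded state yields the clip-level (weak) prediction, and the remaining frame states yield frame-level (strong) predictions. All module names, feature dimensions, and hyperparameters here are illustrative assumptions.

```python
# Minimal sketch of a Transformer-encoder SED model with a tag token.
# Hypothetical names and hyperparameters; not the paper's exact configuration.
import torch
import torch.nn as nn


class TagTokenSED(nn.Module):
    def __init__(self, n_mels=64, d_model=128, n_heads=4, n_layers=3, n_classes=10):
        super().__init__()
        self.input_proj = nn.Linear(n_mels, d_model)
        # Learnable tag token, analogous to BERT's [CLS], prepended to the sequence
        self.tag_token = nn.Parameter(torch.randn(1, 1, d_model))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.strong_head = nn.Linear(d_model, n_classes)  # frame-level predictions
        self.weak_head = nn.Linear(d_model, n_classes)    # clip-level prediction

    def forward(self, x):
        # x: (batch, frames, n_mels) log-mel feature sequence
        h = self.input_proj(x)
        tag = self.tag_token.expand(h.size(0), -1, -1)
        h = self.encoder(torch.cat([tag, h], dim=1))
        weak = torch.sigmoid(self.weak_head(h[:, 0]))       # aggregated via the tag token
        strong = torch.sigmoid(self.strong_head(h[:, 1:]))  # per-frame event activities
        return strong, weak
```

In a weakly-supervised setting, only the weak output would be trained against clip-level labels (e.g., with binary cross-entropy), while the strong output serves at inference for temporal localization of events.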
