SW-WaveNet: Learning Representation from Spectrogram and Wavegram Using WaveNet for Anomalous Sound Detection
Haihui Chen (Huazhong University of Science and Technology); Likai Ran (Huazhong University of Science and Technology); Xixia Sun (Nanjing University of Posts and Telecommunications); Chao Cai (Huazhong University of Science and Technology)
SPS
Anomalous Sound Detection (ASD) aims to identify whether the sound emitted by a machine is anomalous. Most advanced methods use 2-D CNNs to extract features of normal sounds from log-mel spectrograms. However, these methods cannot fully exploit the temporal information in log-mel spectrograms, resulting in poor performance on some machine types. In this paper, we propose a new framework for ASD named Spectrogram-Wavegram WaveNet (SW-WaveNet), which segments the 2-D log-mel spectrogram into 1-D waveform signals of different frequency bands and combines the representation vectors that WaveNet extracts from the segmented log-mel spectrograms and from Wavegrams. The proposed framework uses WaveNet's strong capability for modeling waveform signals to extract temporal information effectively from both log-mel spectrograms and Wavegrams. Experiments on the DCASE 2020 Challenge Task 2 dataset show that our framework achieves a higher average AUC (93.25%) and pAUC (87.41%) than previous work.
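The segmentation step described above can be illustrated with a minimal sketch. The abstract does not say how frequency bands are grouped, so this example assumes the simplest reading: each mel bin of a spectrogram stored as (n_mels, n_frames) becomes one 1-D signal over time. The function name `split_into_band_signals` and the toy values are hypothetical and not taken from the paper.

```python
from typing import List

# Assumed layout: outer list indexes mel bands, inner list indexes time frames.
Spectrogram = List[List[float]]


def split_into_band_signals(log_mel: Spectrogram) -> List[List[float]]:
    """Return one 1-D time series per mel band (assumption: one band per mel bin)."""
    if not log_mel:
        raise ValueError("log-mel spectrogram must contain at least one band")
    n_frames = len(log_mel[0])
    if any(len(row) != n_frames for row in log_mel):
        raise ValueError("all mel bands must have the same number of frames")
    # Copy each row so the 1-D signals can be processed independently.
    return [list(row) for row in log_mel]


# Toy 3-band, 4-frame "spectrogram"; the values are made up for illustration.
toy_log_mel = [
    [0.1, 0.2, 0.3, 0.4],
    [1.0, 1.1, 1.2, 1.3],
    [2.0, 2.1, 2.2, 2.3],
]
band_signals = split_into_band_signals(toy_log_mel)
```

Under this assumption, each entry of `band_signals` could be passed to a 1-D sequence model such as WaveNet; how the paper actually groups bands and fuses the resulting vectors with the Wavegram branch is not specified in the abstract.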