Time-weighted Frequency Domain Audio Representation with GMM Estimator for Anomalous Sound Detection

Jian Guan (Harbin Engineering University); Youde Liu (Harbin Institute of Technology); Qiaoxi Zhu (University of Technology Sydney); Tieran Zheng (Harbin Institute of Technology); Jiqing Han (Harbin Institute of Technology); Wenwu Wang (University of Surrey)

07 Jun 2023

Although deep learning is the mainstream approach to unsupervised anomalous sound detection, a Gaussian Mixture Model (GMM) with a statistical audio frequency representation as input can achieve comparable results with much lower model complexity and fewer parameters. Existing statistical frequency representations, e.g. the average or maximum of the log-Mel spectrogram over time, do not always work well across different machines. This paper presents the Time-Weighted Frequency Domain Representation (TWFR) combined with a GMM (TWFR-GMM) for anomalous sound detection. The TWFR is a generalized statistical frequency domain representation that adapts to different machine types by applying global weighted ranking pooling over the time dimension. This allows the GMM estimator to recognize anomalies, even under domain-shift conditions, as visualized with a Mahalanobis distance-based metric. Experiments on the DCASE 2022 Challenge Task 2 dataset show that our method achieves better detection performance than recent deep learning methods. TWFR-GMM is the core of our submission that achieved 3rd place in DCASE 2022 Challenge Task 2.
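
As a rough illustration of how a weighted ranking pooling over time can sit between the average and the maximum, the following minimal sketch pools each frequency bin of a spectrogram into a single value. It assumes the commonly used form of global weighted ranking pooling, in which the values of a bin are sorted in descending order and weighted by r^0, r^1, r^2, ... with a decay parameter r between 0 and 1. The abstract does not spell out this formula, so the weighting scheme, the parameter name `r`, and the function names `gwrp_weights` and `time_weighted_pool` are assumptions made for illustration, not the authors' implementation.

```python
def gwrp_weights(num_frames, r):
    """Return normalized weights r**0, r**1, ..., r**(num_frames - 1).

    r is a hypothetical decay parameter in [0, 1]:
    r = 1 gives uniform weights (mean), r = 0 puts all weight on rank 0 (max).
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    raw = [r ** j for j in range(num_frames)]  # note: 0.0 ** 0 == 1.0 in Python
    total = sum(raw)
    return [w / total for w in raw]


def time_weighted_pool(spec, r):
    """Pool a spectrogram over time, one output value per frequency bin.

    spec is a list of rows, one row per frequency bin, each row holding
    that bin's values across time frames.
    """
    pooled = []
    for row in spec:
        ranked = sorted(row, reverse=True)      # largest value gets rank 0
        weights = gwrp_weights(len(ranked), r)
        pooled.append(sum(w * v for w, v in zip(weights, ranked)))
    return pooled


if __name__ == "__main__":
    # Toy 2-bin, 3-frame "spectrogram" (made-up numbers).
    toy = [[1.0, 3.0, 2.0], [4.0, 0.0, 2.0]]
    for r in (0.0, 0.5, 1.0):
        print(r, time_weighted_pool(toy, r))
```

Under this assumed form, r = 1 reproduces the time average and r = 0 reproduces the time maximum, so intermediate values interpolate between the two statistics named in the abstract. The pooled vectors could then be fitted with any GMM implementation (for example scikit-learn's `GaussianMixture`) on normal sounds, with a low likelihood at test time taken as a sign of anomaly. That pairing is also a sketch here, not a description of the authors' exact pipeline.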

Tags:

Detection and classification of acoustic scenes and events
