RANDMASKING AUGMENT: A SIMPLE AND RANDOMIZED DATA AUGMENTATION FOR ACOUSTIC SCENE CLASSIFICATION

JuBum Han (Samsung Research); Mateusz Matuszewski (Samsung R&D Institute Poland); Olaf Sikorski (Samsung R&D Poland); Hosang Sung (Samsung Research); Hoonyoung Cho (Samsung Research)

06 Jun 2023

In this work, we describe RandMasking Augment, an effective data augmentation method for acoustic scene classification. We build on the time- and frequency-domain masking introduced in SpecAugment and apply, to the masked region, various transformations that preserve the time and frequency information of the original spectrogram. Because the acoustic features are transformed into various forms without distorting frequency and time information, the proposed augmentation can capture the distinctive characteristics of the input audio in detail. Moreover, RandMasking Augment can be extended by mixing in other audio samples and by applying different weights to frequency bands in the randomized masking region. We evaluate the proposed augmentation on the DCASE 2018 Task1A and DCASE 2019 Task1A datasets and compare it with other augmentation methods. The proposed augmentation shows strong performance with a range of popular convolutional neural networks.
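
The abstract does not list the exact transformations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It picks one random frequency band and one random time band, as in SpecAugment, and then applies a randomly chosen operation to that region: plain zero masking, per-frequency-band weighting, or mixing with a second spectrogram. The function name `rand_masking_sketch`, the parameter `max_width`, the weight range 0.5 to 1.0, and the three operations are all assumptions made for illustration. Spectrograms are plain nested lists (frequency rows by time frames) to keep the example dependency-free.

```python
import random


def rand_masking_sketch(spec, other=None, max_width=4, rng=None):
    """Hypothetical sketch: transform a random time/frequency region.

    spec, other: lists of frequency rows, each a list of time-frame values.
    Returns the augmented copy plus the chosen operation and bands.
    """
    rng = rng or random.Random()
    n_freq, n_time = len(spec), len(spec[0])

    def band(size):
        # Random contiguous band, at most max_width bins wide.
        width = rng.randint(1, min(max_width, size))
        start = rng.randint(0, size - width)
        return range(start, start + width)

    f_band, t_band = band(n_freq), band(n_time)

    # Assumed set of operations; "mix" needs a second spectrogram.
    ops = ["zero", "weight"] + (["mix"] if other is not None else [])
    op = rng.choice(ops)
    lam = rng.uniform(0.0, 1.0)  # mixing coefficient (assumption)
    weights = [rng.uniform(0.5, 1.0) for _ in range(n_freq)]  # per-band gains

    out = [row[:] for row in spec]  # leave the input untouched
    for f in range(n_freq):
        for t in range(n_time):
            # The region is the union of the frequency band and the time band.
            if f not in f_band and t not in t_band:
                continue
            if op == "zero":
                out[f][t] = 0.0  # classic SpecAugment-style mask
            elif op == "weight":
                out[f][t] = weights[f] * spec[f][t]
            else:
                out[f][t] = lam * spec[f][t] + (1.0 - lam) * other[f][t]
    return out, op, f_band, t_band
```

Passing a seeded `random.Random` instance as `rng` makes the augmentation reproducible, which helps when comparing it against other augmentation methods on the same data.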

Tags:

Detection and classification of acoustic scenes and events

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
