Autonomous In-Situ Soundscape Augmentation via Joint Selection of Masker and Gain
Karn N Watcharasupat (Georgia Institute of Technology); Kenneth Ooi (Nanyang Technological University); Bhan Lam (Nanyang Technological University); Trevor Wong (Nanyang Technological University); Zhen-Ting Ong (Nanyang Technological University); Woon Seng Gan (Nanyang Technological University)
SPS
The selection of maskers and playback gain levels in an in-situ soundscape augmentation system is crucial to its effectiveness in improving the overall acoustic comfort of a given environment. Traditionally, the selection of appropriate maskers and gain levels has been informed by expert opinion, which may not be representative of the target population, or by listening tests, which can be time- and labor-intensive. Furthermore, the resulting static choices of masker and gain are often inflexible to dynamic real-world soundscapes. In this work, we used a deep learning model to perform joint selection of the optimal masker and its gain level for a given soundscape. The proposed model was designed with highly modular building blocks, allowing for an optimized inference process that can quickly search through a large number of masker-gain combinations. In addition, we introduced the use of feature-domain soundscape augmentation conditioned on the digital gain level, eliminating the computationally expensive waveform-domain mixing process during inference, as well as the tedious gain adjustment process required for new maskers. The proposed system was evaluated on a large-scale dataset of subjective responses to augmented soundscapes from 442 participants, with the best model achieving a mean squared error of 0.122±0.005 on the pleasantness score, validating the ability of the model to predict the combined effect of the masker and its gain level on the perceptual pleasantness level. The proposed system thus allows in-situ or mixed-reality soundscape augmentation to be performed autonomously with near real-time latency while continuously accounting for changes in acoustic environments.
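The joint search over masker-gain combinations described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' model: the encoder, the feature-domain mixing rule, the output head, and all names (`encode`, `predict_pleasantness`, `select_masker_and_gain`, `EMB_DIM`) are hypothetical stand-ins with random weights, and the gain grid is an assumed example. The sketch only shows the general idea that soundscape and masker embeddings can be computed once, combined in the feature domain for every candidate gain, and scored in a single batched pass without any waveform-domain mixing.

```python
import numpy as np

rng = np.random.default_rng(0)
N_MELS, EMB_DIM = 64, 16  # hypothetical feature and embedding sizes

def encode(features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Hypothetical encoder: average over time frames, then project."""
    return np.tanh(features.mean(axis=0) @ weights)

def predict_pleasantness(scape_emb, masker_embs, gains_db, w_out):
    """Score every (masker, gain) pair at once.

    scape_emb: (D,), masker_embs: (M, D), gains_db: (G,) -> scores: (M, G)
    """
    # Assumed feature-domain augmentation: scale each masker embedding by a
    # gain-dependent factor and add it to the soundscape embedding.
    gain_scale = 10.0 ** (gains_db / 20.0)  # dB to linear amplitude
    mixed = (scape_emb[None, None, :]
             + gain_scale[None, :, None] * masker_embs[:, None, :])
    # Stand-in output head mapping each mixed embedding to a scalar score.
    return np.tanh(mixed) @ w_out

def select_masker_and_gain(scape_emb, masker_embs, gains_db, w_out):
    """Return (masker index, gain in dB, predicted score) of the best pair."""
    scores = predict_pleasantness(scape_emb, masker_embs, gains_db, w_out)
    m, g = np.unravel_index(np.argmax(scores), scores.shape)
    return int(m), float(gains_db[g]), float(scores[m, g])

# Random stand-ins for trained weights and for log-mel-like input features.
w_enc = rng.normal(size=(N_MELS, EMB_DIM))
w_out = rng.normal(size=EMB_DIM)
soundscape = rng.normal(size=(100, N_MELS))              # (frames, mel bins)
maskers = [rng.normal(size=(80, N_MELS)) for _ in range(5)]
gains_db = np.linspace(-30.0, 0.0, 7)                    # assumed gain grid

# Masker embeddings are computed once and reused for every new soundscape.
masker_embs = np.stack([encode(m, w_enc) for m in maskers])
scape_emb = encode(soundscape, w_enc)

best_masker, best_gain, best_score = select_masker_and_gain(
    scape_emb, masker_embs, gains_db, w_out)
print(best_masker, best_gain, best_score)
```

Because the masker embeddings do not depend on the soundscape, only the soundscape needs to be re-encoded when the environment changes, which is one way a modular design could keep the search over many masker-gain pairs fast.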