
Semantically-informed Deep Neural Networks for sound recognition

Michele Esposito (Maastricht University); Giancarlo Valente (Maastricht University); Yenisel Plasencia-Calaña (Maastricht University); Michel Dumontier (Maastricht University); Bruno L. Giordano (CNRS); Elia Formisano (Maastricht University)

07 Jun 2023

Deep neural networks (DNNs) for sound recognition learn to categorize a barking sound as a "dog" and a meowing sound as a "cat", but they do not exploit the semantic relations between classes (e.g., both are animal vocalisations). Cognitive neuroscience research, however, suggests that human listeners automatically exploit higher-level semantic information about the sources in addition to acoustic information. Inspired by this notion, we introduce here a DNN that learns to recognize sounds and simultaneously learns the semantic relations between the sources (semDNN). Comparison of semDNN with a homologous network trained with categorical labels (catDNN) revealed that semDNN produces semantically more accurate labelling than catDNN in sound recognition tasks and that semDNN embeddings preserve higher-level semantic relations between sound sources. Importantly, through a model-based analysis of human dissimilarity ratings of natural sounds, we show that semDNN approximates the behaviour of human listeners better than catDNN and several other DNN and NLP comparison models.

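To make the contrast between the two training regimes concrete, the sketch below shows one plausible way to set them up: a shared audio encoder trained either with categorical cross-entropy over class indices (catDNN-style) or by regressing the audio representation onto pretrained word embeddings of the class labels with a cosine loss (semDNN-style). This is not the authors' implementation; the architecture, loss choices, and all names (AudioBackbone, CatDNN, SemDNN, word_vecs) are illustrative assumptions.

```python
# Minimal sketch (PyTorch), assuming log-mel spectrogram inputs and
# pretrained NLP word vectors for the class labels. Not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AudioBackbone(nn.Module):
    """Toy CNN encoder standing in for the shared audio feature extractor."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, x):
        return self.net(x)

class CatDNN(nn.Module):
    """Categorical variant: plain softmax classification over sound classes."""
    def __init__(self, n_classes, embed_dim=128):
        super().__init__()
        self.backbone = AudioBackbone(embed_dim)
        self.head = nn.Linear(embed_dim, n_classes)

    def loss(self, x, class_idx):
        return F.cross_entropy(self.head(self.backbone(x)), class_idx)

class SemDNN(nn.Module):
    """Semantic variant: project audio features into the space of the label
    word embeddings, so semantically related classes remain close."""
    def __init__(self, label_vectors, embed_dim=128):
        super().__init__()
        self.backbone = AudioBackbone(embed_dim)
        self.proj = nn.Linear(embed_dim, label_vectors.shape[1])
        self.register_buffer("label_vectors", F.normalize(label_vectors, dim=1))

    def loss(self, x, class_idx):
        pred = F.normalize(self.proj(self.backbone(x)), dim=1)
        target = self.label_vectors[class_idx]
        return (1.0 - F.cosine_similarity(pred, target, dim=1)).mean()

# Usage with dummy data: 4 spectrograms, 10 classes, 300-d label embeddings.
specs = torch.randn(4, 1, 64, 100)
labels = torch.randint(0, 10, (4,))
word_vecs = torch.randn(10, 300)   # placeholder for real pretrained embeddings
print(CatDNN(10).loss(specs, labels).item())
print(SemDNN(word_vecs).loss(specs, labels).item())
```

The intuition behind the semantic target is that, under such a loss, a misclassification of "dog" as "wolf" is penalised less than "dog" as "engine", because the word embeddings of related classes lie closer together; a one-hot cross-entropy target treats both errors identically.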