RAT: Radial Attention Transformer for Singing Technique Recognition

Guan-Yuan Chen (National Tsing Hua University); Ya-Fen Yeh (National Tsing Hua University); Von-Wun Soo (National Tsing Hua University)

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
06 Jun 2023

Singing techniques are important skills in professional vocal performance and usually involve deliberate fluctuations of timbre, pitch, duration, and loudness. Recognizing the type of singing technique can be quite challenging because 1) the time-frequency features in singing are highly dynamic and may span a long range of the audio signal; 2) different singing techniques, such as vibrato and trill, tend to have similar local features; and 3) singing technique datasets suffer from a long-tailed class distribution. To address these problems, we propose a novel Radial Attention Transformer (RAT) with a Radial Attention (RA) Module that can capture fine-grained local features as well as long-range interdependencies among audio features. Experimental results show that the proposed method, RAT with Adaptive Logit Adjustment (ALA) Loss, significantly outperforms previous state-of-the-art models (Convolutional Neural Networks and Deformable CNN) on the recognition of singing technique categories.
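
The abstract does not specify how the ALA loss is computed. As a rough illustration of how the general family of logit-adjustment losses counteracts a long-tailed label distribution, the sketch below implements plain (non-adaptive) logit adjustment: each logit is shifted by the log of its class prior before the cross-entropy is taken. This is an assumption-laden stand-in, not the authors' ALA loss; the function name, the `tau` parameter, and the class counts are hypothetical.

```python
import math


def logit_adjusted_cross_entropy(logits, label, class_counts, tau=1.0):
    """Cross-entropy on logits shifted by tau * log(class prior).

    Illustrative sketch of plain logit adjustment for long-tailed data.
    It is NOT the Adaptive Logit Adjustment (ALA) loss named in the abstract.
    """
    total = sum(class_counts)
    # Log-prior of each class, estimated from training-set label counts.
    log_priors = [math.log(count / total) for count in class_counts]
    # Frequent classes get a larger additive boost than rare ones.
    adjusted = [z + tau * lp for z, lp in zip(logits, log_priors)]
    # Numerically stable log-sum-exp over the adjusted logits.
    peak = max(adjusted)
    log_norm = peak + math.log(sum(math.exp(a - peak) for a in adjusted))
    return log_norm - adjusted[label]


# Hypothetical counts for three technique classes: common, medium, rare.
counts = [900, 90, 10]
equal_logits = [0.0, 0.0, 0.0]
loss_common = logit_adjusted_cross_entropy(equal_logits, 0, counts)
loss_rare = logit_adjusted_cross_entropy(equal_logits, 2, counts)
```

With equal raw logits, the rare class incurs a larger loss than the common one, so during training the model has to produce a larger margin for rare classes to reach the same loss. Setting `tau=0.0` recovers ordinary cross-entropy.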