Self-Transcriber: Few-shot Lyrics Transcription with Self-training

Xiaoxue Gao (National University of Singapore); Xianghu Yue (National University of Singapore); Haizhou Li (The Chinese University of Hong Kong, Shenzhen)

06 Jun 2023

Current lyrics transcription approaches rely heavily on supervised learning with labeled data, but such data are scarce and manual labeling of singing is expensive. How to benefit from unlabeled data and alleviate the limited-data problem has not been explored for lyrics transcription. We propose the first semi-supervised lyrics transcription paradigm, Self-Transcriber, which leverages unlabeled data using self-training with noisy student augmentation. We aim to demonstrate that lyrics transcription is possible with only a small amount of labeled data. Self-Transcriber generates pseudo labels for the unlabeled singing using a teacher model, and adds the pseudo-labeled data to the labeled data so that the student model is updated with both self-training and supervised training losses. This work closes the gap between supervised and semi-supervised learning and opens the door to few-shot learning of lyrics transcription. Our experiments show that our approach, using only 12.7 hours of labeled data, achieves competitive performance compared with supervised approaches trained on 149.1 hours of labeled data.
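The teacher-student procedure described in the abstract can be sketched in a few lines of Python. This is only an illustrative outline, not the authors' implementation: the abstract does not specify the model, the augmentation, the loss function, or how the two losses are weighted, so `pseudo_label`, `augment`, `combined_loss`, the gain-scaling augmentation, and `self_training_weight` are all hypothetical names and choices made for this example.

```python
from typing import Callable, List, Sequence, Tuple

Audio = Sequence[float]      # stand-in for singing-audio features (assumption)
Example = Tuple[Audio, str]  # (audio, lyrics transcript)


def pseudo_label(teacher: Callable[[Audio], str],
                 unlabeled: List[Audio]) -> List[Example]:
    """Transcribe unlabeled singing with the teacher to obtain pseudo labels."""
    return [(audio, teacher(audio)) for audio in unlabeled]


def augment(audio: Audio, scale: float = 0.9) -> List[float]:
    """Hypothetical noisy-student augmentation: simple gain scaling."""
    return [x * scale for x in audio]


def combined_loss(student_loss: Callable[[Audio, str], float],
                  labeled: List[Example],
                  pseudo: List[Example],
                  self_training_weight: float = 1.0) -> float:
    """Supervised loss on labeled data plus a weighted self-training loss
    on augmented pseudo-labeled data (the weighting is an assumption)."""
    supervised = sum(student_loss(a, y) for a, y in labeled) / max(len(labeled), 1)
    self_training = (sum(student_loss(augment(a), y) for a, y in pseudo)
                     / max(len(pseudo), 1))
    return supervised + self_training_weight * self_training
```

In a real system the student would be updated by minimizing `combined_loss` with gradient descent, and the updated student could in turn serve as the teacher for another round of pseudo-labeling.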

Tags:

Music signal analysis, processing and synthesis

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
