On The Preparation And Validation Of A Large-Scale Dataset Of Singing Transcription
Jun-You Wang, Jyh-Shing Roger Jang
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:09:32
This paper proposes a large-scale dataset for singing transcription, along with methods for fine-tuning and validating its contents. The dataset, named MIR-ST500, consists of more than 160,000 notes from 500 pop songs. To create it, we set labeling criteria and ask non-experts to label the notes. We then adjust the annotations to correct minor errors. Finally, to validate the dataset, we train a singing transcription model on MIR-ST500 and evaluate it on various datasets. The results show that MIR-ST500, being properly labeled and validated, can be used to build a better singing transcription model for various purposes.
Chairs:
Johanna Devaney