Singing Melody Extraction From Polyphonic Music Based On Spectral Correlation Modeling

Xingjian Du, Bilei Zhu, Qiuqiang Kong, Zejun Ma

Length: 00:06:11

09 Jun 2021

Convolutional neural network (CNN) based methods have achieved state-of-the-art performance for singing melody extraction from polyphonic music. However, most of these methods focus on learning local features and often neglect the relationships among spectral components that lie far apart. In this paper, we explore the idea of modeling spectral correlation explicitly for melody extraction. Specifically, we present a spectral correlation module (SCM) that learns to model the relationships among all frequency bands in a time-frequency representation, allowing global spectral information to be encoded into a conventional CNN. Furthermore, we propose integrating center frequencies with the input feature map of the SCM to improve performance. We implement a lightweight model composed of SCM blocks to verify the efficacy of our system. Our system achieves a state-of-the-art overall accuracy of 83.5% on the MedleyDB dataset.
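The abstract does not spell out how the SCM is built, so the following is only a rough sketch of the general idea (relating every frequency band to every other band, with center frequencies appended to the input feature map), not the authors' module. It assumes a self-attention-style mixing across the frequency axis, a log-scaled center-frequency channel, and random matrices standing in for learned weights. The function name `frequency_correlation_block`, the tensor shapes, and the semitone-spaced frequency grid are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def frequency_correlation_block(feat, center_freqs, rng):
    """Mix information across all frequency bins of a feature map.

    feat:         (C, F, T) feature map (channels, frequency bins, frames)
    center_freqs: (F,) center frequency of each bin in Hz
    rng:          NumPy Generator; random weights stand in for learned ones

    Hypothetical illustration only, not the SCM from the paper.
    """
    C, F, T = feat.shape

    # Assumption: center frequencies enter as one extra input channel,
    # log-scaled and standardized so they are comparable to the features.
    log_f = np.log(center_freqs)
    log_f = (log_f - log_f.mean()) / log_f.std()
    freq_chan = np.broadcast_to(log_f[None, :, None], (1, F, T))
    x = np.concatenate([feat, freq_chan], axis=0)      # (C + 1, F, T)

    # Treat each frequency bin of each frame as a token: (T, F, C + 1).
    tokens = x.transpose(2, 1, 0)
    d = C + 1

    # Random projections in place of trained parameters (assumption).
    w_q = rng.standard_normal((d, d)) / np.sqrt(d)
    w_k = rng.standard_normal((d, d)) / np.sqrt(d)
    w_v = rng.standard_normal((d, C)) / np.sqrt(d)

    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v

    # (T, F, F): for every frame, a weight between every pair of bins,
    # so distant bins (e.g. harmonics) can influence each other directly.
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d), axis=-1)

    # Residual connection keeps the local features from the CNN path.
    out = feat + (attn @ v).transpose(2, 1, 0)         # (C, F, T)
    return out, attn


rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 60, 10))                # 4 channels, 60 bins, 10 frames
center_freqs = 32.7 * 2.0 ** (np.arange(60) / 12.0)    # semitone grid (assumed)
out, attn = frequency_correlation_block(feat, center_freqs, rng)
print(out.shape, attn.shape)
```

The output keeps the input shape, so a block of this kind could in principle be dropped between ordinary convolutional layers. The per-frame `F x F` weight matrix is what gives each bin access to the whole spectrum rather than only a local neighborhood.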

Chairs:

Helene Crayencourt

Tags:

signal processing society

IEEE icassp 2021

virtual conference

june 6-11 2021
