Similarity Learning For Cover Song Identification Using Cross-Similarity Matrices Of Multi-Level Deep Sequences
Chaoya Jiang, Deshun Yang, Xiaoou Chen
SPS
Length: 12:11
In recent years, several deep learning models have been proposed for cover song identification, and they have been designed to learn fixed-length feature vectors for music tracks. However, the temporal progression of music, which is important for measuring the melody similarity between two tracks, is not well represented by fixed-length vectors. In this paper, we propose a new Siamese network architecture for music melody similarity metric learning. The architecture consists of two parts. One part is a network that learns a deep sequence representation of each music track, and the other is a similarity estimation network that takes as input the cross-similarity matrices calculated from the deep sequences of a pair of tracks. The two networks are jointly trained and optimized to achieve high melody similarity prediction accuracy. Experiments conducted on several public datasets demonstrate the superiority of the proposed architecture.
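As a rough illustration of the cross-similarity step described in the abstract, the sketch below builds one matrix per representation level from the deep sequences of two tracks. It is not the authors' implementation: the use of cosine similarity, the (time steps, feature dimension) array layout, and the function names `cross_similarity` and `multi_level_cross_similarity` are all assumptions made for this example.

```python
import numpy as np


def cross_similarity(seq_a, seq_b, eps=1e-8):
    """Cosine cross-similarity between two sequences (assumed measure).

    seq_a has shape (T_a, D) and seq_b has shape (T_b, D), where each row
    is the feature vector of one time step. The result has shape
    (T_a, T_b); entry (i, j) compares step i of track A with step j of
    track B.
    """
    # Normalize every time step to unit length; eps guards against zero rows.
    a = seq_a / (np.linalg.norm(seq_a, axis=1, keepdims=True) + eps)
    b = seq_b / (np.linalg.norm(seq_b, axis=1, keepdims=True) + eps)
    return a @ b.T


def multi_level_cross_similarity(levels_a, levels_b):
    """Build one cross-similarity matrix per level (hypothetical helper).

    levels_a and levels_b are lists of sequences, one per level of the
    sequence network, paired level by level.
    """
    return [cross_similarity(a, b) for a, b in zip(levels_a, levels_b)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two tracks of different lengths, with two levels each (toy data).
    track_a = [rng.normal(size=(40, 16)), rng.normal(size=(20, 32))]
    track_b = [rng.normal(size=(50, 16)), rng.normal(size=(25, 32))]
    for matrix in multi_level_cross_similarity(track_a, track_b):
        print(matrix.shape)
```

Because the matrix keeps one axis per track, the two tracks may differ in length, and the temporal ordering that a fixed-length vector discards remains visible to the similarity estimation network.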