SIMILARITY LEARNING FOR COVER SONG IDENTIFICATION USING CROSS-SIMILARITY MATRICES OF MULTI-LEVEL DEEP SEQUENCES

14 May 2020  ·  Chaoya Jiang, Deshun Yang, Xiaoou Chen

In recent years, several deep learning models have been proposed for cover song identification, and they have been designed to learn fixed-length feature vectors for music tracks. However, the temporal progression of music, which is important for measuring the melody similarity between two tracks, is not well represented by fixed-length vectors. In this paper, we propose a new Siamese network architecture for music melody similarity metric learning. The architecture consists of two parts. One part is a network for learning the deep sequence representation of music tracks, and the other is a similarity estimation network which takes as input the cross-similarity matrices calculated from the deep sequences of a pair of tracks. The two networks are jointly trained and optimized to achieve high melody similarity prediction accuracy. Experiments conducted on several public datasets demonstrate the superiority of the proposed architecture.
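To make the idea of a cross-similarity matrix concrete, here is a minimal sketch in plain Python. It assumes each track has already been turned into a sequence of feature vectors and that cosine similarity is used to compare frames. The abstract does not state which similarity measure the authors use, so the function name `cross_similarity_matrix`, the cosine measure, and the toy vectors are illustrative assumptions, not the paper's method.

```python
import math


def cross_similarity_matrix(seq_a, seq_b):
    """Return a len(seq_a) x len(seq_b) matrix of pairwise similarities.

    Assumption: cosine similarity between frames; the paper may use a
    different measure.
    """
    def cosine(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
        # Guard against all-zero vectors.
        return dot / norm if norm > 0 else 0.0

    # Entry (i, j) compares frame i of track A with frame j of track B.
    return [[cosine(u, v) for v in seq_b] for u in seq_a]


# Hypothetical toy sequences: 3 frames for track A, 2 frames for track B.
track_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
track_b = [[2.0, 0.0], [0.0, 3.0]]
matrix = cross_similarity_matrix(track_a, track_b)
```

Because the matrix keeps one entry per pair of frames, it preserves the temporal alignment information that a single fixed-length vector per track would discard.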

Results from the Paper

Task                       Dataset       Model   Metric  Value  Global Rank
Cover song identification  Covers80      SCNN-M  MAP     0.909  #2
Cover song identification  SHS100K-TEST  SCNN-M  mAP     0.740  #3
Cover song identification  YouTube350    SCNN-M  MAP     0.961  #1
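
The MAP (mean average precision) metric reported in the results can be sketched as follows. This is a generic illustration of the standard definition, not the evaluation code used in the paper. The function names and the toy rankings are hypothetical, and the sketch assumes each ranked list contains every relevant cover for its query.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: list of booleans, ordered by predicted similarity,
    where True marks a true cover of the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at the rank of each relevant item.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries):
    """Mean of the per-query average precision values."""
    return sum(average_precision(q) for q in queries) / len(queries)


# Hypothetical rankings for two queries.
rankings = [[True, False, True], [False, True]]
score = mean_average_precision(rankings)
```

A score of 1.0 would mean every true cover is ranked ahead of every non-cover for every query.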