MusicCaps is a dataset composed of 5.5k music-text pairs, with rich text descriptions provided by human experts. For each 10-second music clip, MusicCaps provides:
44 PAPERS • 1 BENCHMARK
MusicBench is a dataset of music audio-text pairs designed for text-to-music generation and released alongside the Mustango text-to-music model. It is based on the MusicCaps dataset, which it expands from 5,521 samples to 52,768 training samples and 400 test samples.
1 PAPER • 1 BENCHMARK
The dataset used to train and evaluate TunesFormer is collected from two sources: The Session and ABCnotation.com. The Session is a community website focused on Irish traditional music, while ABCnotation.com is a website that provides a standard for folk and traditional music notation in the form of ASCII text files. The combined dataset consists of 285,449 ABC tunes, with 99% (282,595) used as the training set and the remaining 1% (2,854) used as the evaluation set.
1 PAPER • NO BENCHMARKS YET
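The 99%/1% split reported for the TunesFormer dataset can be reproduced arithmetically with a short sketch. The listing does not say how tunes were assigned to each set, so the seeded random shuffle below is an assumption, and `split_tunes`, `eval_fraction`, and `seed` are hypothetical names introduced only for illustration. Integer tune IDs stand in for the actual ABC files.

```python
import random


def split_tunes(tunes, eval_fraction=0.01, seed=0):
    """Shuffle the tunes and return (train, evaluation) lists.

    Assumption: a seeded random shuffle. The source only reports the
    resulting set sizes, not the procedure used to assign tunes.
    """
    shuffled = list(tunes)
    random.Random(seed).shuffle(shuffled)
    # Round the evaluation set size down; the rest goes to training.
    n_eval = int(len(shuffled) * eval_fraction)
    return shuffled[n_eval:], shuffled[:n_eval]


# Placeholder IDs standing in for the 285,449 ABC tunes.
train, evaluation = split_tunes(range(285_449))
print(len(train), len(evaluation))
```

Rounding the evaluation size down gives 2,854 tunes, which leaves 282,595 for training and matches the counts quoted in the description.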