MusicBench is a dataset of music audio-text pairs designed for text-to-music generation, released alongside the Mustango text-to-music model. It builds on the MusicCaps dataset, expanding it from 5,521 samples to 52,768 training samples and 400 test samples.
## Dataset Details

MusicBench expands MusicCaps by:
- Including music features (chords, beats, tempo, and key) extracted from the audio.
- Describing these music features with text templates, thereby enriching the original text prompts.
- Increasing the number of audio samples through musically meaningful augmentations: semitone pitch shifts, tempo changes, and volume changes.
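The three augmentations above can be illustrated with a minimal, dependency-free sketch. This is not the Mustango pipeline itself (which would operate on real waveforms with proper DSP tooling); the function names and the naive resampling approach are illustrative assumptions.

```python
import math

def semitone_ratio(n_semitones: float) -> float:
    """Equal-temperament frequency ratio for a pitch shift of n semitones."""
    return 2.0 ** (n_semitones / 12.0)

def change_volume(samples: list, gain_db: float) -> list:
    """Scale amplitude by a decibel gain (negative values attenuate)."""
    scale = 10.0 ** (gain_db / 20.0)
    return [s * scale for s in samples]

def change_tempo(samples: list, factor: float) -> list:
    """Naive tempo change via linear-interpolation resampling.

    Note: plain resampling also shifts pitch; a production pipeline
    would use phase-vocoder time stretching to change tempo alone.
    """
    n_out = max(1, int(round(len(samples) / factor)))
    out = []
    for i in range(n_out):
        # Map each output index to a fractional position in the input.
        pos = i * (len(samples) - 1) / max(1, n_out - 1)
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```

For example, a +12-semitone shift doubles the fundamental frequency (`semitone_ratio(12) == 2.0`), and a tempo factor of 2.0 halves the number of samples.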
- Train set size: 52,768 samples
- Test set size: 400 samples
The release also includes FMACaps, which serves as a second test set.