Text-to-Music Generation
13 papers with code • 2 benchmarks • 3 datasets
Most implemented papers
Mustango: Toward Controllable Text-to-Music Generation
Through extensive experiments, we show that the quality of the music generated by Mustango is state-of-the-art, and the controllability through music-specific text prompts greatly outperforms other models such as MusicGen and AudioLDM2.
The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation
We introduce the Song Describer dataset (SDD), a new crowdsourced corpus of high-quality audio-caption pairs, designed for the evaluation of music-and-language models.
PAM: Prompting Audio-Language Models for Audio Quality Assessment
Here, we exploit the ability of audio-language models to be prompted about audio quality and introduce PAM, a no-reference metric for assessing audio quality across different audio processing tasks.