2 dataset results for Audio Generation AND Texts

AudioCaps

AudioCaps is a dataset of sounds with event descriptions that was introduced for the task of audio captioning, with sounds sourced from the AudioSet dataset. Annotators were provided the audio tracks together with category hints (and with additional video hints if needed).

176 PAPERS • 10 BENCHMARKS

Audio-alpaca

Audio-alpaca: A preference dataset for aligning text-to-audio models. Audio-alpaca is a pairwise preference dataset containing about 15k (prompt, chosen, rejected) triplets: given a textual prompt, chosen is the preferred generated audio and rejected is the undesirable audio.

1 PAPER • NO BENCHMARKS YET
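
The (prompt, chosen, rejected) triplet structure described for Audio-alpaca can be illustrated with a minimal sketch. The class name `PreferenceTriplet`, the helper `is_valid`, and the file paths are hypothetical and chosen only for illustration; the dataset's actual field names and storage format are not specified in this listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferenceTriplet:
    """One pairwise preference record (hypothetical schema)."""
    prompt: str    # textual prompt given to the text-to-audio model
    chosen: str    # reference to the preferred generated audio (assumed: a file path)
    rejected: str  # reference to the undesirable generated audio (assumed: a file path)


def is_valid(triplet: PreferenceTriplet) -> bool:
    """A triplet is usable only if the prompt is non-empty and the two audios differ."""
    return bool(triplet.prompt.strip()) and triplet.chosen != triplet.rejected


# Illustrative record; the prompt and paths are made up for this sketch.
example = PreferenceTriplet(
    prompt="A dog barks while rain falls",
    chosen="audio/sample_a.wav",
    rejected="audio/sample_b.wav",
)
```

A record where `chosen` and `rejected` point to the same audio carries no preference signal, which is why the sketch treats it as invalid.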

