CVSS is a massively multilingual-to-English speech to speech translation (S2ST) corpus, covering sentence-level parallel S2ST pairs from 21 languages into English. CVSS is derived from the Common Voice speech corpus and the CoVoST 2 speech-to-text translation (ST) corpus, by synthesizing the translation text from CoVoST 2 into speech using state-of-the-art TTS systems
18 PAPERS • 1 BENCHMARK
LibriS2S is a Speech to Speech Translation (S2ST) dataset build further upon existing resources. The dataset provides English-German speech and text quadruplets ranging just over 50 hours for both languages.
1 PAPER • NO BENCHMARKS YET