4 dataset results for Speech-to-Speech Translation AND Speech

CVSS is a massively multilingual-to-English speech to speech translation (S2ST) corpus, covering sentence-level parallel S2ST pairs from 21 languages into English. CVSS is derived from the Common Voice speech corpus and the CoVoST 2 speech-to-text translation (ST) corpus, by synthesizing the translation text from CoVoST 2 into speech using state-of-the-art TTS systems

18 PAPERS • 1 BENCHMARK

SpeechMatrix

SpeechMatrix is a large-scale multilingual corpus of speech-to-speech translations mined from real speech of European Parliament recordings. It contains speech alignments in 136 language pairs with a total of 418 thousand hours of speech.

5 PAPERS • NO BENCHMARKS YET

LibriS2S

LibriS2S is a Speech to Speech Translation (S2ST) dataset build further upon existing resources. The dataset provides English-German speech and text quadruplets ranging just over 50 hours for both languages.

1 PAPER • NO BENCHMARKS YET

TAT (Taiwanese Across Taiwan)

Taiwanese Across Taiwan (TAT) corpus is a Large-Scale database of Native Taiwanese Article/Reading Speech collected across Taiwan. This corpus contains native Taiwanese speech of various accent across Taiwan. The corpus is annotated twice for use in voice recognition research. The corpus contains recording from 100 native speakers, each with length of 30 minutes making a total of 100 hours of speech data.

1 PAPER • 1 BENCHMARK

Datasets

4 dataset results for Speech-to-Speech Translation AND Speech