2 dataset results for Audio Synthesis

The Trinity Gesture Dataset includes 23 takes, totalling 244 minutes of motion capture and audio of a male native English speaker producing spontaneous speech on different topics. The actor’s motion was captured with 20 Vicon cameras at 59.94 frames per second (fps), and the skeleton includes 69 joints.

2 PAPERS • 3 BENCHMARKS

mDRT (Multilingual Diagnostic Rhyme Test)

We present a multilingual test set for conducting speech intelligibility tests in the form of diagnostic rhyme tests. The materials currently contain audio recordings in 5 languages, and further extensions are in progress. For Mandarin Chinese, we provide recordings for a consonant contrast test as well as a tonal contrast test. Our paper [arXiv preprint] and GitHub repository give further information on the audio data, the test procedure, and the software for setting up a full survey that can be deployed on crowdsourcing platforms. We welcome contributions to this open-source project.

1 PAPER • NO BENCHMARKS YET
