CSL-Daily (Chinese Sign Language Corpus) is a large-scale continuous sign language translation (SLT) dataset. It provides both spoken language translations and gloss-level annotations. Its content covers people's daily lives (e.g., travel, shopping, medical care), the most likely application scenario for SLT.
43 PAPERS • 4 BENCHMARKS
How2Sign is a multimodal and multiview continuous American Sign Language (ASL) dataset. It consists of a parallel corpus of more than 80 hours of sign language videos and a set of corresponding modalities, including speech, English transcripts, and depth. A three-hour subset was additionally recorded in the Panoptic Studio, enabling detailed 3D pose estimation.
31 PAPERS • 3 BENCHMARKS
OpenASL is a large-scale American Sign Language (ASL)-English dataset collected from online video sites (e.g., YouTube). It contains 288 hours of ASL videos in multiple domains from over 200 signers.
9 PAPERS • NO BENCHMARKS YET
An artificial corpus built using grammatical dependency rules to compensate for the lack of sign language resources.
1 PAPER • 1 BENCHMARK
LSA-T is the first continuous Argentinian Sign Language (LSA) dataset. It contains 14,880 sentence-level videos of LSA extracted from the CN Sordos YouTube channel, with label and keypoint annotations for each signer. Videos are full HD (1920x1080) at 30 FPS.
Sign Language Datasets for French Belgian Sign Language. This dataset builds on the work of Belgian linguists from the University of Namur. Over eight years, they collected and annotated 50 hours of videos depicting sign language conversation. 100 signers were recorded, making it one of the most representative sign language corpora. The annotations have been sanitized and enriched with metadata to construct two easy-to-use datasets for sign language recognition: one for continuous sign language recognition and the other for isolated sign recognition.
1 PAPER • NO BENCHMARKS YET
We provide separate training, development and test data. The training data is available right away. The development and test data will be released in several stages, starting with a release of the development sources only.
1 PAPER • 2 BENCHMARKS