7 dataset results for Sign Language Translation AND Texts

CSL-Daily

CSL-Daily (Chinese Sign Language Corpus) is a large-scale continuous sign language translation (SLT) dataset. It provides both spoken language translations and gloss-level annotations. Its content covers topics from people's daily lives (e.g., travel, shopping, medical care), the most likely application scenario for SLT.

43 PAPERS • 4 BENCHMARKS

How2Sign (A Large-scale Multimodal Dataset for Continuous American Sign Language)

How2Sign is a multimodal and multiview continuous American Sign Language (ASL) dataset consisting of a parallel corpus of more than 80 hours of sign language videos and a set of corresponding modalities including speech, English transcripts, and depth. A three-hour subset was further recorded in the Panoptic studio, enabling detailed 3D pose estimation.

31 PAPERS • 3 BENCHMARKS

OpenASL

OpenASL is a large-scale American Sign Language (ASL) - English dataset collected from online video sites (e.g., YouTube). It contains 288 hours of ASL videos in multiple domains from over 200 signers.

9 PAPERS • NO BENCHMARKS YET

ASLG-PC12 (English-ASL Gloss Parallel Corpus 2012)

An artificial corpus built using grammatical dependency rules, created because of the lack of resources for sign language.

1 PAPER • 1 BENCHMARK

LSA-T (Lengua de Señas Argentina - Traducción)

LSA-T is the first continuous Argentinian Sign Language (LSA) dataset. It contains 14,880 sentence-level videos of LSA extracted from the CN Sordos YouTube channel, with labels and keypoint annotations for each signer. Videos are in full HD (1920x1080) at 30 FPS.

1 PAPER • 1 BENCHMARK

LSFB Datasets (French Belgian Sign Language Datasets)

Sign language datasets for French Belgian Sign Language. This dataset is built upon the work of Belgian linguists from the University of Namur. Over eight years, they collected and annotated 50 hours of videos depicting sign language conversation. 100 signers were recorded, making it one of the most representative sign language corpora. The annotations have been sanitized and enriched with metadata to construct two easy-to-use datasets for sign language recognition: one for continuous sign language recognition and the other for isolated sign recognition.

1 PAPER • NO BENCHMARKS YET

WMT-SLT

We provide separate training, development and test data. The training data is available right away. The development and test data will be released in several stages, starting with a release of the development sources only.

1 PAPER • 2 BENCHMARKS