The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the cocktail party effect from an augmented-reality (AR) -motivated multi-sensor egocentric world view. The dataset contains AR glasses egocentric multi-channel microphone array audio, wide field-of-view RGB video, speech source pose, headset microphone audio, annotated voice activity, speech transcriptions, head and face bounding boxes and source identification labels. We have created and are releasing this dataset to facilitate research in multi-modal AR solutions to the cocktail party problem.
15 PAPERS • 4 BENCHMARKS
The Tongue and Lips (TaL) corpus is a multi-speaker corpus of ultrasound images of the tongue and video images of lips. This corpus contains synchronised imaging data of extraoral (lips) and intraoral (tongue) articulators from 82 native speakers of English.
3 PAPERS • NO BENCHMARKS YET