LRW (Lip Reading in the Wild)

Introduced by Joon Son Chung et al. in Lip Reading in the Wild

The Lip Reading in the Wild (LRW) dataset is a large-scale audio-visual database that contains 500 different words spoken by over 1,000 speakers. Each utterance is a 29-frame clip centered on the target word. The database is divided into training, validation and test sets. The training set contains at least 800 utterances for each class, while the validation and test sets each contain 50 utterances per class.
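The figures in the description imply some simple arithmetic, sketched below in Python. This is only an illustration: the helper names are hypothetical, the 50 validation and test utterances are read as per-class counts, and the 25 fps frame rate is an assumption that the description above does not state.

```python
# Figures taken from the dataset description.
NUM_CLASSES = 500            # distinct target words
FRAMES_PER_CLIP = 29         # frames in every utterance
MIN_TRAIN_PER_CLASS = 800    # lower bound on training utterances per word
EVAL_PER_CLASS = 50          # validation / test utterances per word (read as per class)


def split_sizes():
    """Return the implied split sizes; the training figure is a lower bound."""
    return {
        "train_min": NUM_CLASSES * MIN_TRAIN_PER_CLASS,
        "val": NUM_CLASSES * EVAL_PER_CLASS,
        "test": NUM_CLASSES * EVAL_PER_CLASS,
    }


def centre_frame_index(num_frames=FRAMES_PER_CLIP):
    """Zero-based index of the middle frame, around which the word is centered."""
    return num_frames // 2


def clip_duration_seconds(num_frames=FRAMES_PER_CLIP, fps=25.0):
    """Clip length in seconds; fps=25 is an assumption, not stated in the description."""
    return num_frames / fps


if __name__ == "__main__":
    print(split_sizes())
    print(centre_frame_index())
    print(clip_duration_seconds())
```

Under these assumptions the training set holds at least 400,000 utterances, and the validation and test sets hold 25,000 each.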

Source: Towards Pose-invariant Lip-Reading

Benchmarks

Task	Dataset Variant	Best Model
Lipreading	Lip Reading in the Wild	3D Conv + ResNet-18 + DC-TCN + KD
Unconstrained Lip-synchronization	LRW	Wav2Lip + GAN
Talking Face Generation	LRW	LipGAN
Visual Keyword Spotting	LRW	Transpotter
Audio-Visual Speech Recognition	LRW	2DCNN + BiLSTM + ResNet + MLF
Lip Reading	LRW	Lip2Wav
Lip to Speech Synthesis	LRW	Lip2Wav