Visual Speech Recognition
40 papers with code • 2 benchmarks • 5 datasets
Latest papers
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition
In this paper, we investigate this contrasting phenomenon from the perspective of modality bias and reveal that an excessive bias toward the audio modality, induced by dropout, is the underlying cause.
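As a rough illustration of the mechanism the paper analyzes, the sketch below (PyTorch-style; module names, sizes, and the dropout scheme are illustrative assumptions, not the authors' code) shows how randomly zeroing the video stream during training lets a fusion model learn to rely on audio alone:

```python
# Hypothetical sketch of modality dropout in an audio-visual model.
# All names and dimensions are illustrative, not taken from the paper.
import torch
import torch.nn as nn

class AVFusion(nn.Module):
    def __init__(self, dim=256, p_drop_video=0.5):
        super().__init__()
        self.p_drop_video = p_drop_video  # probability of dropping the video stream
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, audio_feat, video_feat):
        # During training, randomly zero the video features. Over many steps
        # the model learns it can solve the task from audio alone, which is
        # the "modality bias toward audio" the paper attributes to dropout.
        if self.training and torch.rand(1).item() < self.p_drop_video:
            video_feat = torch.zeros_like(video_feat)
        return self.fuse(torch.cat([audio_feat, video_feat], dim=-1))
```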
Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing
In visual speech processing, context modeling capability is one of the most important requirements due to the ambiguous nature of lip movements.
The NPU-ASLP-LiAuto System Description for Visual Speech Recognition in CNVSRC 2023
This paper describes the visual speech recognition (VSR) system submitted by the NPU-ASLP-LiAuto team (Team 237) to the first Chinese Continuous Visual Speech Recognition Challenge (CNVSRC) 2023, competing in the fixed and open tracks of the Single-Speaker VSR Task and the open track of the Multi-Speaker VSR Task.
Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation
Considering that visual information helps to improve speech recognition performance in noisy scenes, in this work we propose a multichannel multi-modal speech self-supervised learning framework, AV-wav2vec2, which takes video and multichannel audio data as inputs.
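A minimal sketch of what such a multichannel audio-visual encoder could look like (layer sizes, names, and the fusion-by-summation choice are assumptions for illustration, not AV-wav2vec2's actual architecture):

```python
# Illustrative multichannel audio-visual encoder; an assumed design,
# not the AV-wav2vec2 architecture from the paper.
import torch
import torch.nn as nn

class MultichannelAVEncoder(nn.Module):
    def __init__(self, n_channels=4, a_dim=80, v_dim=512, d_model=256):
        super().__init__()
        # All microphone channels are stacked and projected jointly, so the
        # model can exploit spatial information across channels.
        self.audio_proj = nn.Linear(n_channels * a_dim, d_model)
        self.video_proj = nn.Linear(v_dim, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2,
        )

    def forward(self, audio, video):
        # audio: (B, T, n_channels * a_dim) stacked multichannel features
        # video: (B, T, v_dim) per-frame lip-region embeddings
        x = self.audio_proj(audio) + self.video_proj(video)
        return self.encoder(x)  # fused representation for SSL objectives
```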
Do VSR Models Generalize Beyond LRS3?
The Lip Reading Sentences-3 (LRS3) benchmark has been the primary focus of intense research in visual speech recognition (VSR) over the last few years.
LIP-RTVE: An Audiovisual Database for Continuous Spanish in the Wild
Speech is considered a multi-modal process in which hearing and vision are two fundamental pillars.
Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder
In this paper, we propose two novel techniques to improve audio-visual speech recognition (AVSR) under a pre-training and fine-tuning framework.
MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition
In this paper, we aim to learn representations shared across modalities in order to bridge the gap between them.
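The standard adversarial recipe for modality-invariant features is sketched below under assumed shapes and names (not MIR-GAN's actual objectives): a discriminator learns to tell audio features from video features, while the encoders are trained to fool it.

```python
# Hedged sketch of adversarial modality-invariance; names and shapes
# are assumptions, not MIR-GAN's code.
import torch
import torch.nn as nn

disc = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 1))
bce = nn.BCEWithLogitsLoss()

def adversarial_losses(audio_feat, video_feat):
    # Frame-level features: (B, T, 256). Discriminator labels audio=1, video=0.
    ones = torch.ones(*audio_feat.shape[:-1], 1)
    zeros = torch.zeros(*video_feat.shape[:-1], 1)
    # Discriminator step: detach encoder outputs so only `disc` is updated.
    d_loss = (bce(disc(audio_feat.detach()), ones)
              + bce(disc(video_feat.detach()), zeros))
    # Encoder step: invert the labels so the encoders learn to fool the
    # discriminator, pushing both modalities toward a shared representation.
    g_loss = bce(disc(audio_feat), zeros) + bce(disc(video_feat), ones)
    return d_loss, g_loss
```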
Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition
In this work, we investigate the noise-invariant visual modality to strengthen the robustness of AVSR, enabling adaptation to arbitrary test-time noise without depending on noisy training data, i.e., unsupervised noise adaptation.
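For context, viseme classes group phonemes that look identical on the lips. The toy mapping below uses common textbook groupings (bilabials, labiodentals, etc.); it is not the universal learned mapping from the paper.

```python
# Toy viseme-phoneme mapping: visually identical phonemes (e.g. the
# bilabials /p b m/) collapse to a single viseme class. Groupings are a
# common textbook example, not the paper's learned mapping.
PHONEME_TO_VISEME = {
    "p": "V_bilabial", "b": "V_bilabial", "m": "V_bilabial",
    "f": "V_labiodental", "v": "V_labiodental",
    "t": "V_alveolar", "d": "V_alveolar", "s": "V_alveolar", "z": "V_alveolar",
    "k": "V_velar", "g": "V_velar",
}

def phonemes_to_visemes(phonemes):
    """Map a phoneme sequence onto its (lossy) viseme sequence."""
    return [PHONEME_TO_VISEME.get(p, "V_other") for p in phonemes]

# "bat" and "mat" become visually identical after the mapping, which is
# exactly the ambiguity that audio (or a learned mapping) must resolve:
assert phonemes_to_visemes(["b", "a", "t"]) == phonemes_to_visemes(["m", "a", "t"])
```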
OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment
We demonstrate that OpenSR enables modality transfer from one modality to any other in three settings (zero-shot, few-shot, and full-shot), and achieves highly competitive zero-shot performance compared with existing few-shot and full-shot lip-reading methods.