Talking Face Generation

37 papers with code • 2 benchmarks • 6 datasets

Talking face generation aims to synthesize a sequence of face images that correspond to given speech semantics.

(Image credit: Talking Face Generation by Adversarially Disentangled Audio-Visual Representation)

Benchmarks

These leaderboards are used to track progress in Talking Face Generation.

Trend	Dataset	Best Model	Paper	Code	Compare
	LRW	LipGAN			See all
	CREMA-D	EmoGen			See all

Most implemented papers

A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild

Rudrabha/Wav2Lip • 23 Aug 2020

However, they fail to accurately morph the lip movements of arbitrary identities in dynamic, unconstrained talking face videos, resulting in significant parts of the video being out-of-sync with the new audio.

MakeItTalk: Speaker-Aware Talking-Head Animation

yzhou359/MakeItTalk • 27 Apr 2020

We present a method that generates expressive talking heads from a single facial image with audio as the only input.

Talking Face Generation by Conditional Recurrent Adversarial Network

susanqq/Talking_Face_Generation • 13 Apr 2018

Given an arbitrary face image and an arbitrary speech clip, the proposed work attempts to generate a talking face video with accurate lip synchronization while maintaining smooth transitions of both lip and facial movement over the entire video clip.

Talking Face Generation by Adversarially Disentangled Audio-Visual Representation

Hangz-nju-cuhk/Talking-Face-Generation-DAVS • 20 Jul 2018

Talking face generation aims to synthesize a sequence of face images that correspond to a clip of speech.

ReenactGAN: Learning to Reenact Faces via Boundary Transfer

wywu/ReenactGAN • ECCV 2018

A transformer is subsequently used to adapt the boundary of the source face to the boundary of the target face.

Capture, Learning, and Synthesis of 3D Speaking Styles

TimoBolkart/voca • CVPR 2019

To address this, we introduce a unique 4D face dataset with about 29 minutes of 4D scans captured at 60 fps and synchronized audio from 12 speakers.

Neural Voice Puppetry: Audio-driven Facial Reenactment

miu200521358/NeuralVoicePuppetryMMD • ECCV 2020

Neural Voice Puppetry has a variety of use-cases, including audio-driven video avatars, video dubbing, and text-driven video synthesis of a talking head.

Speech Driven Talking Face Generation from a Single Image and an Emotion Condition

eeskimez/emotalkingface • 8 Aug 2020

Visual emotion expression plays an important role in audiovisual speech communication.

Stochastic Talking Face Generation Using Latent Distribution Matching

ry85/Stochastic-Talking-Face-Generation-Using-Latent-Distribution-Matching • 21 Nov 2020

Indeed, just having the ability to generate a single talking face would make a system almost robotic in nature.

AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis

YudongGuo/AD-NeRF • ICCV 2021

Generating high-fidelity talking head video that fits an input audio sequence is a challenging problem that has received considerable attention recently.
