Talking Face Generation

37 papers with code • 2 benchmarks • 6 datasets

Talking face generation aims to synthesize a sequence of face images that corresponds to given speech semantics.

(Image credit: Talking Face Generation by Adversarially Disentangled Audio-Visual Representation)

Benchmarks

These leaderboards are used to track progress in Talking Face Generation.

Dataset	Best Model
LRW	LipGAN
CREMA-D	EmoGen

Latest papers

Deepfake Generation and Detection: A Benchmark and Survey

flyingby/awesome-deepfake-generation-and-detection • 26 Mar 2024

Deepfake is a technology dedicated to creating highly realistic facial images and videos under specific conditions, which has significant application potential in fields such as entertainment, movie production, and digital human creation.

Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis

yerfor/Real3DPortrait • 16 Jan 2024

One-shot 3D talking portrait generation aims to reconstruct a 3D avatar from an unseen image, and then animate it with a reference video or audio to generate a talking portrait video.

586

Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism

g-milis/NEUTART • 11 Dec 2023

Our method, which we call NEUral Text to ARticulate Talk (NEUTART), is a talking face generator that uses a joint audiovisual feature space, as well as speech-informed 3D facial reconstructions and a lip-reading loss for visual supervision.

SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis

ZiqiaoPeng/SyncTalk • 29 Nov 2023

A lifelike talking head requires synchronized coordination of subject identity, lip movements, facial expressions, and head poses.

782

HyperLips: Hyper Control Lips with High Resolution Decoder for Talking Face Generation

semchan/HyperLips • 9 Oct 2023

First, FaceEncoder extracts features from the face frames of the source video to obtain a latent code. Then HyperConv, whose weight parameters are updated by HyperNet with the audio features as input, modifies the latent code to synchronize the lip movement with the audio.
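The hypernetwork idea in this summary can be illustrated with a small, self-contained sketch. It is not the HyperLips implementation: the single linear layer standing in for HyperNet, the 1x1 convolution standing in for HyperConv, and all names and sizes (`hyper_net`, `hyper_conv`, `AUDIO_DIM`, and so on) are assumptions chosen only to show how audio features can generate the weights that transform a latent code.

```python
import random

def hyper_net(audio_feat, proj, out_ch, in_ch):
    """Hypothetical HyperNet: one linear layer mapping audio features to conv weights."""
    # Each row of proj produces one convolution weight from the audio features.
    flat = [sum(p * a for p, a in zip(row, audio_feat)) for row in proj]
    # Reshape the flat weight vector into an out_ch x in_ch kernel (a 1x1 convolution).
    return [flat[o * in_ch:(o + 1) * in_ch] for o in range(out_ch)]

def hyper_conv(latent, kernel):
    """Hypothetical HyperConv: apply the generated 1x1 kernel at every latent position."""
    # latent is a list of positions, each holding in_ch channel values.
    return [[sum(w * x for w, x in zip(row, pos)) for row in kernel] for pos in latent]

rng = random.Random(0)
AUDIO_DIM, IN_CH, OUT_CH, POSITIONS = 4, 3, 3, 5  # illustrative sizes only

# Fixed (here: random) HyperNet parameters and a stand-in latent code from a face encoder.
proj = [[rng.uniform(-1, 1) for _ in range(AUDIO_DIM)] for _ in range(OUT_CH * IN_CH)]
latent = [[rng.uniform(-1, 1) for _ in range(IN_CH)] for _ in range(POSITIONS)]

# Two different audio feature vectors yield two different kernels,
# so the same latent code is modified differently for each audio input.
audio_a = [1.0, 0.0, 0.0, 0.0]
audio_b = [0.0, 1.0, 0.0, 0.0]
out_a = hyper_conv(latent, hyper_net(audio_a, proj, OUT_CH, IN_CH))
out_b = hyper_conv(latent, hyper_net(audio_b, proj, OUT_CH, IN_CH))
```

The point of the design is that the convolution has no fixed weights of its own: the audio decides how the visual latent code is transformed, which is one way to couple lip shape to speech.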

145

HDTR-Net: A Real-Time High-Definition Teeth Restoration Network for Arbitrary Talking Face Generation Methods

yylgoodlucky/hdtr • 14 Sep 2023

In particular, we propose a Fine-Grained Feature Fusion (FGFF) module to effectively capture fine texture features of the teeth and surrounding regions, and use these features to refine the feature map and enhance the clarity of the teeth.

106

Identity-Preserving Talking Face Generation with Landmark and Appearance Priors

Weizhi-Zhong/IP_LAP • CVPR 2023

Prior landmark characteristics of the speaker's face are employed to make the generated landmarks coincide with the facial outline of the speaker.

554

15 May 2023

Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert

sxjdwang/talklip • CVPR 2023

To address the problem, we propose using a lip-reading expert to improve the intelligibility of the generated lip regions by penalizing incorrect generation results.
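One way to read "penalizing incorrect generation results" is as an extra loss term computed from the lip-reading expert's predictions on the generated frames. The sketch below assumes a mean negative log-likelihood over the ground-truth tokens and a weighted sum with a reconstruction loss; the function names, the weight of 0.1, and the toy probabilities are hypothetical and are not taken from the paper.

```python
import math

def lip_reading_penalty(expert_probs, target_tokens, eps=1e-12):
    """Mean negative log-likelihood of the target tokens under the expert's predictions.

    expert_probs: one probability distribution (list of floats) per time step,
                  as a hypothetical lip-reading expert might output for generated frames.
    target_tokens: index of the ground-truth token at each time step.
    """
    nll = [-math.log(max(step[tok], eps)) for step, tok in zip(expert_probs, target_tokens)]
    return sum(nll) / len(nll)

def generator_loss(recon_loss, expert_probs, target_tokens, weight=0.1):
    """Assumed total loss: reconstruction term plus the weighted lip-reading penalty."""
    return recon_loss + weight * lip_reading_penalty(expert_probs, target_tokens)

targets = [0, 1]
# Expert reads the intended tokens from well-formed lips with high probability...
readable = [[0.9, 0.1], [0.2, 0.8]]
# ...and with low probability from poorly generated lips.
unreadable = [[0.1, 0.9], [0.8, 0.2]]

loss_readable = generator_loss(1.0, readable, targets)
loss_unreadable = generator_loss(1.0, unreadable, targets)
```

With the same reconstruction loss, the frames the expert cannot read receive the larger total loss, so the generator is pushed toward lip shapes the expert can decode.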

350

29 Mar 2023

Emotionally Enhanced Talking Face Generation

sahilg06/EmoGen • 21 Mar 2023

To mitigate this, we build a talking face generation framework conditioned on a categorical emotion to generate videos with appropriate expressions, making them more realistic and convincing.

316

DINet: Deformation Inpainting Network for Realistic Face Visually Dubbing on High Resolution Video

MRzzm/DINet • 7 Mar 2023

Different from previous works relying on multiple up-sample layers to directly generate pixels from latent embeddings, DINet performs spatial deformation on feature maps of reference images to better preserve high-frequency textural details.

833
