Talking Face Generation

37 papers with code • 2 benchmarks • 6 datasets

Talking face generation aims to synthesize a sequence of face images that corresponds to the given speech semantics.

(Image credit: Talking Face Generation by Adversarially Disentangled Audio-Visual Representation)

Deepfake Generation and Detection: A Benchmark and Survey

flyingby/awesome-deepfake-generation-and-detection 26 Mar 2024

Deepfake is a technology dedicated to creating highly realistic facial images and videos under specific conditions, with significant application potential in fields such as entertainment, movie production, and digital human creation.

81
26 Mar 2024

Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis

yerfor/Real3DPortrait 16 Jan 2024

One-shot 3D talking portrait generation aims to reconstruct a 3D avatar from an unseen image, and then animate it with a reference video or audio to generate a talking portrait video.

586
16 Jan 2024

Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism

g-milis/NEUTART 11 Dec 2023

Our method, which we call NEUral Text to ARticulate Talk (NEUTART), is a talking face generator that uses a joint audiovisual feature space, as well as speech-informed 3D facial reconstructions and a lip-reading loss for visual supervision.

22
11 Dec 2023

SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis

ZiqiaoPeng/SyncTalk 29 Nov 2023

A lifelike talking head requires synchronized coordination of subject identity, lip movements, facial expressions, and head poses.

782
29 Nov 2023

HyperLips: Hyper Control Lips with High Resolution Decoder for Talking Face Generation

semchan/HyperLips 9 Oct 2023

First, FaceEncoder extracts features from the face frames of the source video to obtain a latent code. Then HyperConv, whose weight parameters are updated by HyperNet with the audio features as input, modifies the latent code to synchronize the lip movement with the audio.

145
09 Oct 2023
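
The HyperLips summary above describes a hypernetwork pattern: one network predicts the weights of another layer from the audio features. The following is a minimal pure-Python sketch of that general idea, not the HyperLips code. The class `ToyHyperConv`, its linear hypernetwork, the 1x1 kernel shape, and the residual update are all assumptions made for illustration.

```python
import random


def linear(weights, bias, x):
    """Dense layer on plain lists: returns weights @ x + bias."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


class ToyHyperConv:
    """Toy hypernetwork layer (hypothetical, not the HyperLips implementation).

    A linear "hypernetwork" maps an audio feature vector to the weights of a
    1x1 convolution, which is then applied at every position of the latent code.
    """

    def __init__(self, audio_dim, channels, seed=0):
        rng = random.Random(seed)
        self.channels = channels
        n_out = channels * channels  # one weight per (out_channel, in_channel) pair
        self.hyper_w = [[rng.uniform(-0.1, 0.1) for _ in range(audio_dim)]
                        for _ in range(n_out)]
        self.hyper_b = [0.0] * n_out

    def predict_kernel(self, audio_feat):
        """Run the hypernetwork: audio features -> (channels x channels) kernel."""
        flat = linear(self.hyper_w, self.hyper_b, audio_feat)
        c = self.channels
        return [flat[i * c:(i + 1) * c] for i in range(c)]

    def __call__(self, latent, audio_feat):
        """latent is a list of positions, each a list of `channels` values."""
        kernel = self.predict_kernel(audio_feat)
        zero_bias = [0.0] * self.channels
        # Assumption: residual update, so silent (all-zero) audio leaves the latent unchanged.
        return [[x + d for x, d in zip(pos, linear(kernel, zero_bias, pos))]
                for pos in latent]


layer = ToyHyperConv(audio_dim=3, channels=2)
latent = [[1.0, 2.0], [0.5, -1.0]]
print(layer(latent, [1.0, 0.0, 0.0]))
```

The point of the design is that the same latent code is transformed differently for different audio inputs, because the audio determines the layer's weights rather than being concatenated to its input.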

HDTR-Net: A Real-Time High-Definition Teeth Restoration Network for Arbitrary Talking Face Generation Methods

yylgoodlucky/hdtr 14 Sep 2023

In particular, we propose a Fine-Grained Feature Fusion (FGFF) module to effectively capture fine texture features of the teeth and surrounding regions, and use these features to refine the feature map and enhance the clarity of the teeth.

106
14 Sep 2023

Identity-Preserving Talking Face Generation with Landmark and Appearance Priors

Weizhi-Zhong/IP_LAP CVPR 2023

Prior landmark characteristics of the speaker's face are employed to make the generated landmarks coincide with the facial outline of the speaker.

554
15 May 2023

Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert

sxjdwang/talklip CVPR 2023

To address the problem, we propose using a lip-reading expert to improve the intelligibility of the generated lip regions by penalizing incorrect generation results.

350
29 Mar 2023
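
One common way to penalize incorrect generations with a recognition expert is to score the generated frames by how well the expert can read the intended text from them. The sketch below assumes a hypothetical expert that returns, for each output step, a probability distribution over tokens; the function name `lip_reading_penalty` and the negative log-likelihood form are illustrative assumptions, not the loss confirmed by the paper summary above.

```python
import math


def lip_reading_penalty(expert_probs, target_ids, eps=1e-9):
    """Mean negative log-likelihood of the target tokens under the expert.

    expert_probs: per-step probability lists produced by a (hypothetical)
                  lip-reading expert run on the generated lip region.
    target_ids:   index of the ground-truth token at each step.
    """
    if len(expert_probs) != len(target_ids):
        raise ValueError("expert_probs and target_ids must have the same length")
    # eps guards against log(0) when the expert assigns zero probability.
    nll = [-math.log(max(step[t], eps)) for step, t in zip(expert_probs, target_ids)]
    return sum(nll) / len(nll)


readable = [[0.9, 0.1], [0.2, 0.8]]    # expert mostly recovers the target tokens
unreadable = [[0.1, 0.9], [0.8, 0.2]]  # expert mostly reads the wrong tokens
targets = [0, 1]
print(lip_reading_penalty(readable, targets) < lip_reading_penalty(unreadable, targets))
```

In a training setup along these lines, the penalty would be added to the generator's other losses so that lip regions the expert cannot read are discouraged.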

Emotionally Enhanced Talking Face Generation

sahilg06/EmoGen 21 Mar 2023

To mitigate this, we build a talking face generation framework conditioned on a categorical emotion to generate videos with appropriate expressions, making them more realistic and convincing.

316
21 Mar 2023

DINet: Deformation Inpainting Network for Realistic Face Visually Dubbing on High Resolution Video

MRzzm/DINet 7 Mar 2023

Unlike previous works that rely on multiple up-sampling layers to directly generate pixels from latent embeddings, DINet performs spatial deformation on feature maps of reference images to better preserve high-frequency textural details.

833
07 Mar 2023
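
The DINet summary above contrasts generating pixels from scratch with spatially deforming existing feature maps. The sketch below illustrates the latter with a generic dense-offset warp using bilinear sampling. The specific deformation operator DINet uses is not described in the summary, so the functions `bilinear_sample` and `warp`, and the per-pixel `(dx, dy)` offset format, are assumptions for illustration only.

```python
def bilinear_sample(fmap, x, y):
    """Sample a 2D feature map (list of rows) at a fractional (x, y), clamping to the border."""
    h, w = len(fmap), len(fmap[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = fmap[y0][x0] * (1 - fx) + fmap[y0][x1] * fx
    bottom = fmap[y1][x0] * (1 - fx) + fmap[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp(fmap, flow):
    """Deform fmap: output[y][x] is fmap sampled at (x + dx, y + dy), with flow[y][x] = (dx, dy)."""
    return [[bilinear_sample(fmap, x + flow[y][x][0], y + flow[y][x][1])
             for x in range(len(fmap[0]))]
            for y in range(len(fmap))]


fmap = [[0.0, 1.0, 2.0],
        [3.0, 4.0, 5.0]]
half_right = [[(0.5, 0.0)] * 3 for _ in range(2)]  # sample half a pixel to the right
print(warp(fmap, half_right))
```

Because every output value is an interpolation of values already present in the reference feature map, existing texture is moved rather than re-synthesized, which is the intuition behind preserving high-frequency detail.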