Talking Head Generation
40 papers with code • 7 benchmarks • 3 datasets
Talking head generation is the task of synthesizing a talking face video from one or more images of a person, typically driven by audio or another video.
(Image credit: Few-Shot Adversarial Learning of Realistic Neural Talking Head Models)
Latest papers
DaGAN++: Depth-Aware Generative Adversarial Network for Talking Head Video Generation
In this work, we first present a novel self-supervised method for learning dense 3D facial geometry (i.e., depth) from face videos, without requiring camera parameters or 3D geometry annotations during training.
Face Animation with an Attribute-Guided Diffusion Model
Face animation has made significant progress in computer vision.
Emotionally Enhanced Talking Face Generation
We build a talking face generation framework conditioned on a categorical emotion to generate videos with appropriate expressions, making them more realistic and convincing.
DisCoHead: Audio-and-Video-Driven Talking Head Generation by Disentangled Control of Head Pose and Facial Expressions
We enhance the efficiency of DisCoHead by integrating a dense motion estimator with the encoder of the generator, which are originally separate modules.
DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation
In this way, the proposed DiffTalk is capable of producing high-quality talking head videos in synchronization with the source audio, and more importantly, it can be naturally generalized across different identities without any further fine-tuning.
StyleTalk: One-shot Talking Head Generation with Controllable Speaking Styles
In a nutshell, we aim to attain a speaking style from an arbitrary reference speaking video and then drive the one-shot portrait to speak with the reference speaking style and another piece of audio.
MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
In this work, we propose an ID-preserving talking head generation framework, which advances previous methods in two aspects.
SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
We present SadTalker, which generates 3D motion coefficients (head pose, expression) of the 3DMM from audio and implicitly modulates a novel 3D-aware face render for talking head generation.
Autoregressive GAN for Semantic Unconditional Head Motion Generation
In this work, we address the task of unconditional head motion generation to animate still human faces in a low-dimensional semantic space from a single reference pose.
Compressing Video Calls using Synthetic Talking Heads
We use a state-of-the-art face reenactment network to detect key points in the non-pivot frames and transmit them to the receiver.
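The bandwidth advantage of sending key points instead of full frames can be sketched with a quick back-of-the-envelope calculation. The frame resolution, landmark count, and coordinate precision below are illustrative assumptions, not values taken from the paper:

```python
# Toy illustration of keypoint-based video-call compression: the sender
# transmits only facial key points for non-pivot frames; the receiver
# reconstructs each frame with a face reenactment network.
# All constants here are assumptions chosen for illustration.

FRAME_HEIGHT, FRAME_WIDTH, CHANNELS = 256, 256, 3  # assumed 8-bit RGB frame
NUM_KEYPOINTS = 68                                 # common facial-landmark count
BYTES_PER_COORD = 4                                # float32 (x, y) coordinates

# Size of one uncompressed frame vs. one keypoint payload.
frame_bytes = FRAME_HEIGHT * FRAME_WIDTH * CHANNELS
keypoint_bytes = NUM_KEYPOINTS * 2 * BYTES_PER_COORD

compression_ratio = frame_bytes / keypoint_bytes
print(f"raw frame: {frame_bytes} B, keypoints: {keypoint_bytes} B")
print(f"compression ratio ~= {compression_ratio:.0f}x")
```

Even before conventional codec compression of the raw stream is accounted for, the keypoint payload is orders of magnitude smaller than the pixels it replaces, which is what makes reenactment-based transmission attractive for low-bandwidth calls.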