Talking Head Generation
40 papers with code • 7 benchmarks • 3 datasets
Talking head generation is the task of generating a talking face from a set of images of a person.
( Image credit: Few-Shot Adversarial Learning of Realistic Neural Talking Head Models )
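At its core, the task maps identity images plus a driving signal (typically audio) to a sequence of video frames. The sketch below is a toy stand-in for that interface, not any of the models listed here: the function name, the mean-image "identity embedding", and the audio-energy "lip feature" are all illustrative placeholders.

```python
import numpy as np

def generate_talking_head(source_images, audio, sample_rate=16000, fps=25):
    """Toy sketch of the audio-driven talking-head interface (not a real model).

    source_images: list of HxWx3 float arrays of the target person (identity).
    audio: 1-D waveform that drives the lip motion.
    Returns one frame per 1/fps seconds of audio.
    """
    # Stand-in for an identity embedding: average the reference images.
    identity = np.mean(np.stack(source_images), axis=0)
    samples_per_frame = sample_rate // fps
    n_frames = len(audio) // samples_per_frame
    frames = []
    for i in range(n_frames):
        chunk = audio[i * samples_per_frame:(i + 1) * samples_per_frame]
        # Stand-in for lip-sync features: the chunk's mean amplitude.
        energy = float(np.abs(chunk).mean())
        frames.append(np.clip(identity * (1.0 + energy), 0.0, 1.0))
    return frames
```

A real system replaces the two stand-ins with learned modules (an identity encoder and an audio-to-motion network), but the input/output contract stays the same.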
Latest papers
MoDiTalker: Motion-Disentangled Diffusion Model for High-Fidelity Talking Head Generation
AToM excels in capturing subtle lip movements by leveraging an audio attention mechanism.
Adaptive Super Resolution For One-Shot Talking-Head Generation
In this work, we propose an adaptive high-quality talking-head video generation method, which synthesizes high-resolution video without additional pre-trained modules.
A Comparative Study of Perceptual Quality Metrics for Audio-driven Talking Head Videos
However, performance evaluation research lags behind the development of talking head generation techniques.
SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis
A lifelike talking head requires synchronized coordination of subject identity, lip movements, facial expressions, and head poses.
Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation
Audio-driven talking-head synthesis is a popular research topic for virtual human-related applications.
Text-to-Video: a Two-stage Framework for Zero-shot Identity-agnostic Talking-head Generation
In the second stage, an audio-driven talking head generation method is employed to produce compelling videos given the audio generated in the first stage.
Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head Video Generation
Talking head video generation aims to animate a human face in a still image with dynamic poses and expressions using motion information derived from a target-driving video, while maintaining the person's identity in the source image.
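The video-driven variant described above has a slightly different contract: one source image plus a driving video, with motion extracted from the driver and identity kept from the source. A minimal sketch of that interface, with a deliberately trivial brightness-shift stand-in for the learned motion representation (the function name and motion feature are illustrative, not from the paper):

```python
import numpy as np

def animate_source_image(source_image, driving_frames):
    """Toy sketch of video-driven talking-head animation (not a real model).

    source_image: HxWx3 float array supplying the identity.
    driving_frames: list of HxWx3 float arrays supplying the motion.
    Returns one output frame per driving frame.
    """
    # Stand-in for motion extraction: per-frame brightness change
    # relative to the first driving frame.
    base = driving_frames[0].mean()
    out = []
    for frame in driving_frames:
        motion = frame.mean() - base
        # Stand-in for warping: apply the extracted motion to the source.
        out.append(np.clip(source_image + motion, 0.0, 1.0))
    return out
```

Real methods replace the brightness delta with keypoint- or flow-based motion fields, but the identity-from-source, motion-from-driver split is the same.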
A Comprehensive Multi-scale Approach for Speech and Dynamics Synchrony in Talking Head Generation
Animating still face images with deep generative models using a speech input signal is an active research topic and has seen important recent progress.
Learning Landmarks Motion from Speech for Speaker-Agnostic 3D Talking Heads Generation
This paper presents a novel approach for generating 3D talking heads from raw audio inputs.
RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars
It is a large-scale digital library for head avatars with three key attributes: 1) High Fidelity: all subjects are captured by 60 synchronized, high-resolution 2K cameras in 360 degrees.