Gesture Generation
34 papers with code • 4 benchmarks • 6 datasets
Generation of gestures as a sequence of 3D poses.
Most implemented papers
Style-Controllable Speech-Driven Gesture Synthesis Using Normalising Flows
In interactive scenarios, systems for generating natural animations on the fly are key to achieving believable and relatable characters.
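The paper's title names normalising flows, which model motion with an invertible, likelihood-trainable mapping. A minimal sketch of the core building block, an affine coupling layer, is below; the function names and shapes are illustrative, not the paper's actual code, and in the speech-driven setting the shift and scale would come from a network conditioned on speech and style features.

```python
import numpy as np

def affine_coupling_forward(x, shift, log_scale):
    """One affine coupling step, the basic unit of many normalizing flows.

    Half of the pose vector passes through unchanged; the other half is
    transformed affinely. The log-determinant of the Jacobian is just the
    sum of the log-scales, keeping the likelihood tractable.
    (Illustrative sketch; in the speech-driven case, shift/log_scale would
    be produced by a network conditioned on x1 and on speech features.)
    """
    d = x.shape[-1] // 2
    x1, x2 = x[..., :d], x[..., d:]
    y2 = x2 * np.exp(log_scale) + shift
    log_det = np.sum(log_scale, axis=-1)
    return np.concatenate([x1, y2], axis=-1), log_det

def affine_coupling_inverse(y, shift, log_scale):
    """Exact inverse of the forward pass -- this invertibility is what
    lets the model both sample gestures and score their likelihood."""
    d = y.shape[-1] // 2
    y1, y2 = y[..., :d], y[..., d:]
    x2 = (y2 - shift) * np.exp(-log_scale)
    return np.concatenate([y1, x2], axis=-1)
```

Stacking many such layers (permuting which half is transformed) yields an expressive yet exactly invertible pose model.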
Moving fast and slow: Analysis of representations and post-processing in speech-driven automatic gesture generation
We provide an analysis of different representations for the input (speech) and the output (motion) of the network by both objective and subjective evaluations.
Style Transfer for Co-Speech Gesture Animation: A Multi-Speaker Conditional-Mixture Approach
A key challenge, called gesture style transfer, is to learn a model that generates these gestures for a speaking agent 'A' in the gesturing style of a target speaker 'B'.
No Gestures Left Behind: Learning Relationships between Spoken Language and Freeform Gestures
We study relationships between spoken language and co-speech gestures in the context of two key challenges.
DeepNAG: Deep Non-Adversarial Gesture Generation
We find that DeepNAG outperforms DeepGAN in accuracy, training time (up to 17x faster), and realism, thereby opening the door to a new line of research in generator network design and training for gesture synthesis.
A Framework for Integrating Gesture Generation Models into Interactive Conversational Agents
To date, end-to-end gesture generation methods have not been evaluated in real-time interaction with users.
Probabilistic Human-like Gesture Synthesis from Speech using GRU-based WGAN
The synthesized motions are evaluated with an objective measure and a subjective experiment, showing that the proposed model outperforms a baseline trained on the same dataset with a state-of-the-art GAN-based algorithm.
Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning
Our network consists of two components: a generator to synthesize gestures from a joint embedding space of features encoded from the input speech and the seed poses, and a discriminator to distinguish between the synthesized pose sequences and real 3D pose sequences.
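The generator/discriminator setup described above can be sketched as follows. The single linear map standing in for the generator and the specific loss formulation are assumptions for illustration; the paper uses learned encoders and its own affective-expression objectives.

```python
import numpy as np

def generator(speech_feat, seed_pose, W):
    """Map a joint embedding of speech features and seed poses to poses.

    The 'network' here is one linear map W purely for illustration; in the
    paper, learned encoders produce the joint embedding space.
    Shapes (hypothetical): speech_feat (T, ds), seed_pose (T, dp),
    W (ds + dp, pose_dim).
    """
    z = np.concatenate([speech_feat, seed_pose], axis=-1)  # joint embedding
    return z @ W                                           # (T, pose_dim)

def gan_losses(d_real, d_fake):
    """Standard non-saturating GAN losses from discriminator logits.

    d_real: logits on real 3D pose sequences; d_fake: logits on
    synthesized sequences. The discriminator learns to tell them apart,
    the generator learns to fool it. (Generic formulation, not
    necessarily the paper's exact objective.)
    """
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    d_loss = -np.mean(np.log(sigmoid(d_real) + 1e-8)
                      + np.log(1.0 - sigmoid(d_fake) + 1e-8))
    g_loss = -np.mean(np.log(sigmoid(d_fake) + 1e-8))
    return d_loss, g_loss
```

When the discriminator is confident and correct (high logits on real, low on fake), its loss is near zero while the generator's loss is large, which is the pressure that drives the generator toward realistic pose sequences.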
Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates
Co-speech gesture generation aims to synthesize a gesture sequence that not only looks realistic but also matches the input speech audio.
Gesture2Vec: Clustering Gestures using Representation Learning Methods for Co-speech Gesture Generation
We propose a vector-quantized variational autoencoder structure as well as training techniques to learn a rigorous representation of gesture sequences.
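The core of a vector-quantized bottleneck is a nearest-neighbour codebook lookup, which is what turns continuous gesture latents into discrete, clusterable tokens. A minimal sketch, assuming a fixed codebook (the paper learns it jointly with the encoder/decoder, along with commitment and codebook losses not shown here):

```python
import numpy as np

def vector_quantize(z, codebook):
    """Nearest-neighbour codebook lookup, as in a VQ-VAE bottleneck.

    Each encoded gesture frame z[t] is replaced by its closest codebook
    entry; the returned indices form a discrete 'gesture vocabulary' that
    can be clustered or fed to a sequence model.
    Shapes (illustrative): z (T, D), codebook (K, D).
    """
    # squared distance between every latent frame and every codebook entry
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (T, K)
    idx = dists.argmin(axis=1)          # discrete code per frame
    return codebook[idx], idx           # quantized latents, token ids
```

In training, the straight-through estimator lets gradients bypass the non-differentiable argmin; at inference, the `idx` sequence is the discrete gesture representation the paper builds on.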