TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Motion Synthesis	HumanML3D	ReMoDiffuse	FID	0.103	# 7
Motion Synthesis	HumanML3D	ReMoDiffuse	Diversity	9.018	# 20
Motion Synthesis	HumanML3D	ReMoDiffuse	Multimodality	1.795	# 14
Motion Synthesis	HumanML3D	ReMoDiffuse	R Precision Top3	0.795	# 5
Motion Synthesis	KIT Motion-Language	ReMoDiffuse	FID	0.155	# 1
Motion Synthesis	KIT Motion-Language	ReMoDiffuse	R Precision Top3	0.765	# 5
Motion Synthesis	KIT Motion-Language	ReMoDiffuse	Diversity	10.80	# 15
Motion Synthesis	KIT Motion-Language	ReMoDiffuse	Multimodality	1.239	# 14

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/remodiffuse-retrieval-augmented-motion/motion-synthesis-on-kit-motion-language)](https://paperswithcode.com/sota/motion-synthesis-on-kit-motion-language?p=remodiffuse-retrieval-augmented-motion)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/remodiffuse-retrieval-augmented-motion/motion-synthesis-on-humanml3d)](https://paperswithcode.com/sota/motion-synthesis-on-humanml3d?p=remodiffuse-retrieval-augmented-motion)`

ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model

ICCV 2023 · Mingyuan Zhang, Xinying Guo, Liang Pan, Zhongang Cai, Fangzhou Hong, Huirong Li, Lei Yang, Ziwei Liu

3D human motion generation is crucial for the creative industry. Recent advances rely on generative models with domain knowledge for text-driven motion generation, leading to substantial progress in capturing common motions. However, performance on more diverse motions remains unsatisfactory. In this work, we propose ReMoDiffuse, a diffusion-model-based motion generation framework that integrates a retrieval mechanism to refine the denoising process. ReMoDiffuse enhances the generalizability and diversity of text-driven motion generation with three key designs: 1) Hybrid Retrieval finds appropriate references from the database in terms of both semantic and kinematic similarities. 2) Semantic-Modulated Transformer selectively absorbs retrieval knowledge, adapting to the difference between retrieved samples and the target motion sequence. 3) Condition Mixture better utilizes the retrieval database during inference, overcoming the scale sensitivity in classifier-free guidance. Extensive experiments demonstrate that ReMoDiffuse outperforms state-of-the-art methods by balancing text-motion consistency and motion quality, especially for more diverse motion generation.
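The first and third designs in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the official repository: the function names, the use of sequence length as the kinematic cue, the exponential damping with a hypothetical weight `lam`, and the requirement that mixture weights sum to 1. Only the standard classifier-free guidance formula is a known quantity; the paper's exact formulations may differ.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def hybrid_score(query_feat, entry_feat, target_len, entry_len, lam=1.0):
    """Semantic similarity damped by relative length mismatch (assumed form)."""
    semantic = cosine(query_feat, entry_feat)
    length_gap = abs(entry_len - target_len) / max(entry_len, target_len)
    return semantic * math.exp(-lam * length_gap)

def retrieve(query_feat, target_len, database, k=2):
    """Return the k database entries with the highest hybrid score."""
    ranked = sorted(
        database,
        key=lambda e: hybrid_score(query_feat, e["feat"], target_len, e["length"]),
        reverse=True,
    )
    return ranked[:k]

def classifier_free_guidance(eps_cond, eps_uncond, scale):
    """Standard classifier-free guidance: move from the unconditional
    prediction toward the conditional one by a single guidance scale."""
    return [u + scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]

def mix_conditions(predictions, weights):
    """Weighted sum of denoiser outputs obtained under different condition
    subsets (e.g. text + retrieval, text only, retrieval only, none).
    Assumption: the weights sum to 1, as in classifier-free guidance."""
    if abs(sum(weights) - 1.0) > 1e-6:
        raise ValueError("mixture weights are assumed to sum to 1")
    dim = len(predictions[0])
    return [sum(w * p[i] for w, p in zip(weights, predictions)) for i in range(dim)]

# Hypothetical toy database: text features and motion lengths in frames.
database = [
    {"name": "walk", "feat": [1.0, 0.0], "length": 100},
    {"name": "run", "feat": [0.9, 0.1], "length": 40},
    {"name": "sit", "feat": [0.0, 1.0], "length": 100},
]
top = retrieve([1.0, 0.0], target_len=100, database=database, k=2)
print([entry["name"] for entry in top])
```

In this sketch, classifier-free guidance with scale `s` is the two-prediction special case of the mixture with weights `(s, 1 - s)`, which is one way to see how mixing over more condition subsets replaces a single hand-tuned scale with several weights.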

Code

mingyuan-zhang/ReMoDiffuse official

Tasks

Denoising

Motion Synthesis

Retrieval

Datasets

HumanML3D KIT Motion-Language

Results from the Paper

Ranked #1 on Motion Synthesis on KIT Motion-Language

Methods

Absolute Position Encodings • Adam • BPE • Dense Connections • Dropout • Label Smoothing • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer
