TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Motion Synthesis	HumanML3D	DiverseMotion (s=1)	FID	0.070	# 3
Motion Synthesis	HumanML3D	DiverseMotion (s=1)	Diversity	9.551	# 14
Motion Synthesis	HumanML3D	DiverseMotion (s=1)	Multimodality	2.062	# 10
Motion Synthesis	HumanML3D	DiverseMotion (s=1)	R Precision Top3	0.783	# 12
Motion Synthesis	HumanML3D	DiverseMotion (s=2)	FID	0.072	# 4
Motion Synthesis	HumanML3D	DiverseMotion (s=2)	Diversity	9.683	# 9
Motion Synthesis	HumanML3D	DiverseMotion (s=2)	Multimodality	1.869	# 11
Motion Synthesis	HumanML3D	DiverseMotion (s=2)	R Precision Top3	0.802	# 3
Motion Synthesis	KIT Motion-Language	DiverseMotion	FID	0.468	# 10
Motion Synthesis	KIT Motion-Language	DiverseMotion	R Precision Top3	0.760	# 6
Motion Synthesis	KIT Motion-Language	DiverseMotion	Diversity	10.873	# 10
Motion Synthesis	KIT Motion-Language	DiverseMotion	Multimodality	2.062	# 7
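
For readers who want to work with these numbers programmatically, here is a minimal sketch that transcribes the rows above and picks the best entry per dataset and metric. The names `Result`, `RESULTS`, and `best` are illustrative and not part of the paper or its code release. The direction of each metric (lower FID is better, higher R Precision is better) follows common convention and is an assumption, since the table itself does not state it.

```python
from typing import NamedTuple


class Result(NamedTuple):
    """One row of the results table (hypothetical container, for illustration)."""
    dataset: str
    model: str
    metric: str
    value: float
    rank: int


# Values transcribed from the table above.
RESULTS = [
    Result("HumanML3D", "DiverseMotion (s=1)", "FID", 0.070, 3),
    Result("HumanML3D", "DiverseMotion (s=1)", "Diversity", 9.551, 14),
    Result("HumanML3D", "DiverseMotion (s=1)", "Multimodality", 2.062, 10),
    Result("HumanML3D", "DiverseMotion (s=1)", "R Precision Top3", 0.783, 12),
    Result("HumanML3D", "DiverseMotion (s=2)", "FID", 0.072, 4),
    Result("HumanML3D", "DiverseMotion (s=2)", "Diversity", 9.683, 9),
    Result("HumanML3D", "DiverseMotion (s=2)", "Multimodality", 1.869, 11),
    Result("HumanML3D", "DiverseMotion (s=2)", "R Precision Top3", 0.802, 3),
    Result("KIT Motion-Language", "DiverseMotion", "FID", 0.468, 10),
    Result("KIT Motion-Language", "DiverseMotion", "R Precision Top3", 0.760, 6),
    Result("KIT Motion-Language", "DiverseMotion", "Diversity", 10.873, 10),
    Result("KIT Motion-Language", "DiverseMotion", "Multimodality", 2.062, 7),
]


def best(results, dataset, metric, lower_is_better):
    """Return the row with the best value for one dataset/metric pair."""
    rows = [r for r in results if r.dataset == dataset and r.metric == metric]
    if not rows:
        raise ValueError(f"no rows for {dataset!r} / {metric!r}")
    pick = min if lower_is_better else max
    return pick(rows, key=lambda r: r.value)


if __name__ == "__main__":
    # Assumed convention: lower FID is better, higher R Precision is better.
    print(best(RESULTS, "HumanML3D", "FID", lower_is_better=True))
    print(best(RESULTS, "HumanML3D", "R Precision Top3", lower_is_better=False))
```

On HumanML3D the two variants trade places: one has the lower FID, the other the higher R Precision Top3, which is why both rows are listed.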

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/diversemotion-towards-diverse-human-motion/motion-synthesis-on-humanml3d)](https://paperswithcode.com/sota/motion-synthesis-on-humanml3d?p=diversemotion-towards-diverse-human-motion)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/diversemotion-towards-diverse-human-motion/motion-synthesis-on-kit-motion-language)](https://paperswithcode.com/sota/motion-synthesis-on-kit-motion-language?p=diversemotion-towards-diverse-human-motion)`

DiverseMotion: Towards Diverse Human Motion Generation via Discrete Diffusion

4 Sep 2023 · Yunhong Lou, Linchao Zhu, Yaxiong Wang, Xiaohan Wang, Yi Yang

We present DiverseMotion, a new approach for synthesizing high-quality human motions conditioned on textual descriptions while preserving motion diversity. Despite the recent significant progress in text-based human motion generation, existing methods often prioritize fitting training motions at the expense of action diversity. Consequently, striking a balance between motion quality and diversity remains an unresolved challenge. This problem is compounded by two key factors: 1) the lack of diversity in motion-caption pairs in existing benchmarks and 2) the unilateral and biased semantic understanding of the text prompt, which focuses primarily on the verb component while neglecting the nuanced distinctions indicated by other words. In response to the first issue, we construct a large-scale Wild Motion-Caption dataset (WMC) to extend the restricted action boundary of existing well-annotated datasets, enabling the learning of diverse motions through a more extensive range of actions. To this end, a motion BLIP is trained upon a pretrained vision-language model, and we then automatically generate diverse motion captions for the collected motion sequences. As a result, we build a dataset comprising 8,888 motions coupled with 141k text descriptions. To comprehensively understand the text command, we propose a Hierarchical Semantic Aggregation (HSA) module to capture the fine-grained semantics. Finally, we integrate the above two designs into an effective Motion Discrete Diffusion (MDD) framework to strike a balance between motion quality and diversity. Extensive experiments on HumanML3D and KIT-ML show that our DiverseMotion achieves state-of-the-art motion quality and competitive motion diversity. Dataset, code, and pretrained models will be released to reproduce all of our results.

Code

axdfhj/mdd official

Tasks

Language Modelling

Motion Synthesis

Datasets

Human3.6M

HumanML3D

RICH

KIT Motion-Language

Results from the Paper

Ranked #3 on Motion Synthesis on HumanML3D (using extra training data)

Methods

BLIP • Diffusion
