The MPII Human Pose Descriptions dataset extends the widely used MPII Human Pose Dataset with rich textual annotations. These annotations are generated by several state-of-the-art large language models (LLMs) and include detailed descriptions of the activities being performed, the number of people present, and their specific poses.

The dataset uses the same image splits as MMPose, with 14,644 training samples and 2,723 validation samples. Each image is accompanied by one or more pose descriptions, each generated by a different LLM. Every description also carries additional annotation information (the activity type, people count, and pose keypoints) derived from the original MPII Human Pose Dataset annotations.
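To make the structure above concrete, here is a minimal Python sketch of how one annotated sample might be represented and how several descriptions of the same image could be grouped. The class name, field names, file names, and model labels are all hypothetical and chosen for illustration; they are not the dataset's actual schema. Only the split sizes are taken from the description above.

```python
from collections import defaultdict
from dataclasses import dataclass

# Split sizes as stated in the dataset description.
SPLIT_SIZES = {"train": 14644, "val": 2723}


@dataclass
class PoseDescription:
    """One LLM-generated description of one image (hypothetical schema)."""
    image: str            # image file name (placeholder values below)
    model: str            # which LLM wrote the description (placeholder labels)
    description: str      # free-text pose description
    activity: str         # activity type from the original MPII annotations
    people_count: int     # number of annotated people in the image
    keypoints: list       # one list of (x, y) pairs per person


def group_by_image(records):
    """Collect all descriptions that refer to the same image."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.image].append(record)
    return dict(grouped)


# Made-up example records; keypoints are shortened to two points per person.
records = [
    PoseDescription("img_a.jpg", "model_1", "A person is running.",
                    "running", 1, [[(10.0, 20.0), (12.0, 40.0)]]),
    PoseDescription("img_a.jpg", "model_2", "One runner in mid-stride.",
                    "running", 1, [[(10.0, 20.0), (12.0, 40.0)]]),
    PoseDescription("img_b.jpg", "model_1", "Two people are dancing.",
                    "dancing", 2, [[(5.0, 5.0)], [(50.0, 6.0)]]),
]

by_image = group_by_image(records)
total_samples = sum(SPLIT_SIZES.values())
```

Grouping by image keeps the one-to-many relation between an image and its descriptions explicit, which is convenient when comparing the output of different LLMs for the same pose.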

By adding textual annotations to an existing human pose dataset, this extended version supports research in multi-modal learning that combines visual and textual cues.

