TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
3D Question Answering (3D-QA)	3D MM-Vet	Point-Bind & Point-LLM	Overall Accuracy	23.5	# 5
Generative 3D Object Classification	Objaverse	Point-Bind LLM	Objaverse (I)	6.00	# 1
Generative 3D Object Classification	Objaverse	Point-Bind LLM	Objaverse (Average)	5.25	# 6
Generative 3D Object Classification	Objaverse	Point-Bind LLM	Objaverse (C)	4.50	# 4

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/point-bind-point-llm-aligning-point-cloud/3d-question-answering-3d-qa-on-3d-mm-vet)](https://paperswithcode.com/sota/3d-question-answering-3d-qa-on-3d-mm-vet?p=point-bind-point-llm-aligning-point-cloud)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/point-bind-point-llm-aligning-point-cloud/generative-3d-object-classification-on-1)](https://paperswithcode.com/sota/generative-3d-object-classification-on-1?p=point-bind-point-llm-aligning-point-cloud)`

Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following

1 Sep 2023 · Ziyu Guo, Renrui Zhang, Xiangyang Zhu, Yiwen Tang, Xianzheng Ma, Jiaming Han, Kexin Chen, Peng Gao, Xianzhi Li, Hongsheng Li, Pheng-Ann Heng

We introduce Point-Bind, a 3D multi-modality model aligning point clouds with 2D image, language, audio, and video. Guided by ImageBind, we construct a joint embedding space between 3D and multi-modalities, enabling many promising applications, e.g., any-to-3D generation, 3D embedding arithmetic, and 3D open-world understanding. On top of this, we further present Point-LLM, the first 3D large language model (LLM) following 3D multi-modal instructions. Using parameter-efficient fine-tuning techniques, Point-LLM injects the semantics of Point-Bind into pre-trained LLMs, e.g., LLaMA; it requires no 3D instruction data, yet exhibits superior 3D and multi-modal question-answering capacity. We hope our work may shed light for the community on extending 3D point clouds to multi-modality applications. Code is available at https://github.com/ZiyuGuo99/Point-Bind_Point-LLM.
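The "3D embedding arithmetic" mentioned in the abstract can be illustrated with a minimal sketch. It assumes that arithmetic in a joint embedding space means summing L2-normalized embeddings from different modalities and retrieving the nearest item by cosine similarity. The vectors are hand-written toy values, and the helper names (`l2_normalize`, `combine`, `cosine`, `retrieve`) are hypothetical; none of this is the Point-Bind API or the authors' implementation.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def combine(*embeddings):
    """Sum unit-length embeddings componentwise, then renormalize."""
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must share one dimension")
    normalized = [l2_normalize(e) for e in embeddings]
    summed = [sum(components) for components in zip(*normalized)]
    return l2_normalize(summed)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))

def retrieve(query, gallery):
    """Return the gallery key whose embedding is closest to the query."""
    return max(gallery, key=lambda name: cosine(query, gallery[name]))

# Toy stand-ins for encoder outputs (assumed values, not real embeddings).
point_cloud_car = [1.0, 0.0, 0.0]   # pretend 3D encoder output for a car
audio_waves = [0.0, 1.0, 0.0]       # pretend audio encoder output for sea waves

gallery = {
    "car in a garage": [1.0, 0.0, 0.2],
    "car at the beach": [0.7, 0.7, 0.0],
    "ocean waves only": [0.0, 1.0, 0.0],
    "airplane": [0.0, 0.0, 1.0],
}

query = combine(point_cloud_car, audio_waves)
print(retrieve(query, gallery))  # prints "car at the beach"
```

In this toy setup the combined query lies halfway between the two source embeddings, so the gallery item that mixes both concepts scores highest. Real encoders would produce high-dimensional vectors, but the composition step would look the same under the stated assumption.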

Code

Repository	Stars
ziyuguo99/point-bind_point-llm (official)	365
openrobotlab/pointllm	383
zrrskywalker/point-bind	365
Pointcept/GPT4Point	254
qizekun/ShapeLLM	

Tasks

3D Generation

3D Question Answering (3D-QA)

Generative 3D Object Classification

Instruction Following

Language Modelling

Large Language Model

Question Answering

Datasets

ShapeNet

ModelNet

ESC-50

Objaverse

3D MM-Vet

Results from the Paper

Ranked #5 on 3D Question Answering (3D-QA) on 3D MM-Vet

Methods

LLaMA
