PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm

In contrast to the numerous NLP and 2D vision foundation models, learning a 3D foundation model poses considerably greater challenges, primarily due to the inherent variability of 3D data and the diversity of downstream tasks. In this paper, we introduce a novel universal 3D pre-training framework designed to facilitate the learning of efficient 3D representations, thereby establishing a pathway to 3D foundation models. Observing that informative 3D features should encode rich geometry and appearance cues that can be used to render realistic images, we propose to learn 3D representations through differentiable neural rendering: a 3D backbone is trained together with a devised volumetric neural renderer by comparing the rendered images with the real ones. Notably, our approach seamlessly integrates the learned 3D encoder into various downstream tasks, encompassing not only high-level challenges such as 3D detection and segmentation but also low-level objectives like 3D reconstruction and image synthesis, spanning both indoor and outdoor scenarios. We also demonstrate that the proposed methodology can pre-train a 2D backbone, surpassing conventional pre-training methods by a large margin. For the first time, PonderV2 achieves state-of-the-art performance on 11 indoor and outdoor benchmarks, demonstrating its effectiveness. Code and models are available at https://github.com/OpenGVLab/PonderV2.
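To make the pre-training idea concrete, the following is a minimal sketch of the rendering-based objective described in the abstract, assuming a generic PyTorch setup. Class and argument names (`RenderingPretrainer`, `backbone_3d`, `neural_renderer`, etc.) are illustrative placeholders, not the authors' actual API; the renderer is treated as a black-box differentiable module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RenderingPretrainer(nn.Module):
    """Sketch of pre-training a 3D backbone via differentiable neural rendering.

    A 3D backbone encodes a point cloud into a volumetric feature
    representation; a differentiable volumetric renderer projects it into
    RGB and depth images, which are compared against the real captures.
    """

    def __init__(self, backbone_3d: nn.Module, neural_renderer: nn.Module):
        super().__init__()
        self.backbone_3d = backbone_3d          # e.g. a sparse 3D UNet (hypothetical choice)
        self.neural_renderer = neural_renderer  # differentiable volume renderer (hypothetical module)

    def forward(self, points, camera_poses, intrinsics, gt_rgb, gt_depth):
        # Encode the raw point cloud into a feature volume.
        feature_volume = self.backbone_3d(points)

        # Render RGB and depth from the feature volume for the given views.
        pred_rgb, pred_depth = self.neural_renderer(
            feature_volume, camera_poses, intrinsics
        )

        # Pre-training objective: make the renderings match the real images.
        loss_rgb = F.l1_loss(pred_rgb, gt_rgb)
        loss_depth = F.l1_loss(pred_depth, gt_depth)
        return loss_rgb + loss_depth
```

After pre-training, only the 3D backbone weights are kept and transferred to downstream tasks such as detection and segmentation; the renderer is discarded.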


Results from the Paper


 Ranked #1 on 3D Semantic Segmentation on ScanNet++ (using extra training data)

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Semantic Segmentation | S3DIS | PonderV2 + SparseUNet | Mean IoU | 79.9 | #2 |
| Semantic Segmentation | S3DIS | PonderV2 + SparseUNet | mAcc | 86.5 | #3 |
| Semantic Segmentation | S3DIS | PonderV2 + SparseUNet | oAcc | 92.5 | #2 |
| Semantic Segmentation | S3DIS Area5 | PonderV2 + SparseUNet | mIoU | 73.2 | #7 |
| Semantic Segmentation | S3DIS Area5 | PonderV2 + SparseUNet | oAcc | 92.2 | #3 |
| Semantic Segmentation | S3DIS Area5 | PonderV2 + SparseUNet | mAcc | 79.0 | #4 |
| Semantic Segmentation | ScanNet | PonderV2 | test mIoU | 78.5 | #2 |
| Semantic Segmentation | ScanNet | PonderV2 | val mIoU | 77.0 | #4 |
| 3D Semantic Segmentation | ScanNet++ | PonderV2-SparseUNet-base | Top-1 IoU | 0.386 | #1 |
| 3D Semantic Segmentation | ScanNet200 | PonderV2 + SparseUNet | val mIoU | 32.3 | #5 |
| 3D Semantic Segmentation | ScanNet200 | PonderV2 + SparseUNet | test mIoU | 34.6 | #3 |

Methods


No methods listed for this paper.