Point Transformer V3: Simpler, Faster, Stronger

This paper is not motivated to seek innovation within the attention mechanism. Instead, it focuses on overcoming the existing trade-offs between accuracy and efficiency in point cloud processing, leveraging the power of scale. Drawing inspiration from recent advances in 3D large-scale representation learning, we recognize that model performance is influenced more by scale than by intricate design. We therefore present Point Transformer V3 (PTv3), which prioritizes simplicity and efficiency over the accuracy of mechanisms that contribute little to overall performance after scaling, for example replacing the exact neighbor search of KNN with an efficient serialized neighbor mapping over point clouds organized in specific patterns. This principle enables significant scaling, expanding the receptive field from 16 to 1024 points while remaining efficient (a 3x increase in processing speed and a 10x improvement in memory efficiency compared with its predecessor, PTv2). PTv3 attains state-of-the-art results on over 20 downstream tasks spanning both indoor and outdoor scenarios. Further enhanced with multi-dataset joint training, PTv3 pushes these results to a higher level.
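To make the serialized-neighbor idea concrete, the sketch below (not the paper's implementation) quantizes points onto a voxel grid, orders them along a Z-order (Morton) space-filling curve, and splits the sorted sequence into fixed-size groups within which attention would be applied. The curve choice and the `voxel_size` and `group_size` values are illustrative assumptions.

```python
import numpy as np

def morton_code(grid_coords, bits=10):
    # Interleave the bits of quantized (x, y, z) coordinates into a single
    # Z-order key, so points that are close in 3D tend to be close in the 1D order.
    code = np.zeros(len(grid_coords), dtype=np.int64)
    for b in range(bits):
        for axis in range(3):
            code |= ((grid_coords[:, axis] >> b) & 1) << (3 * b + axis)
    return code

def serialize_and_group(points, voxel_size=0.05, group_size=1024):
    # Quantize to a grid, sort by the serialized (Morton) key, and split the
    # sorted sequence into fixed-size groups; attention is then restricted to
    # each group instead of to explicit KNN neighborhoods.
    grid = np.floor((points - points.min(0)) / voxel_size).astype(np.int64)
    order = np.argsort(morton_code(grid))
    return [order[i:i + group_size] for i in range(0, len(order), group_size)]

# Toy usage: 4096 random points grouped into serialized patches of 1024.
points = np.random.rand(4096, 3).astype(np.float32)
groups = serialize_and_group(points)
print(len(groups), groups[0].shape)  # -> 4 (1024,)
```

Grouping the serialized sequence in this way is what lets the local attention window grow to roughly 1024 points without paying for an exact neighbor search.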


Results from the Paper


PTv3 + PPT ranks #1 on Semantic Segmentation on S3DIS (using extra training data).

| Task | Dataset | Model | Metric | Value | Global Rank |
| --- | --- | --- | --- | --- | --- |
| LiDAR Semantic Segmentation | nuScenes | PTv3 + PPT | test mIoU | 0.830 | #1 |
| LiDAR Semantic Segmentation | nuScenes | PTv3 + PPT | val mIoU | 0.812 | #1 |
| Semantic Segmentation | S3DIS | PTv3 + PPT | Mean IoU | 80.8 | #1 |
| Semantic Segmentation | S3DIS | PTv3 + PPT | mAcc | 87.7 | #2 |
| Semantic Segmentation | S3DIS | PTv3 + PPT | oAcc | 92.6 | #1 |
| Semantic Segmentation | S3DIS | PTv3 + PPT | Number of params | 24.1M | #49 |
| Semantic Segmentation | S3DIS Area5 | PTv3 + PPT | mIoU | 74.7 | #2 |
| Semantic Segmentation | S3DIS Area5 | PTv3 + PPT | oAcc | 92.0 | #5 |
| Semantic Segmentation | S3DIS Area5 | PTv3 + PPT | mAcc | 80.1 | #2 |
| Semantic Segmentation | ScanNet | PTv3 + PPT | test mIoU | 79.4 | #1 |
| Semantic Segmentation | ScanNet | PTv3 + PPT | val mIoU | 78.6 | #1 |
| 3D Semantic Segmentation | ScanNet200 | PTv3 + PPT | val mIoU | 36.0 | #2 |
| 3D Semantic Segmentation | ScanNet200 | PTv3 + PPT | test mIoU | 39.3 | #1 |
| 3D Semantic Segmentation | SemanticKITTI | PTv3 + PPT | test mIoU | 75.5% | #1 |
| 3D Semantic Segmentation | SemanticKITTI | PTv3 + PPT | val mIoU | 72.3% | #2 |

Methods