Efficient 3D Semantic Segmentation with Superpoint Transformer
We introduce a novel superpoint-based transformer architecture for efficient semantic segmentation of large-scale 3D scenes. Our method incorporates a fast algorithm to partition point clouds into a hierarchical superpoint structure, which makes our preprocessing 7 times faster than existing superpoint-based approaches. Additionally, we leverage a self-attention mechanism to capture the relationships between superpoints at multiple scales, leading to state-of-the-art performance on three challenging benchmark datasets: S3DIS (76.0% mIoU 6-fold validation), KITTI-360 (63.5% on Val), and DALES (79.6%). With only 212k parameters, our approach is up to 200 times more compact than other state-of-the-art models while maintaining similar performance. Furthermore, our model can be trained on a single GPU in 3 hours for a fold of the S3DIS dataset, which is 7x to 70x fewer GPU-hours than the best-performing methods. Our code and models are accessible at github.com/drprojects/superpoint_transformer.
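The relative figures in the abstract can be turned into rough absolute numbers. The short sketch below does only that arithmetic; the variable names are illustrative, and the derived values are a reading of the abstract's "up to 200 times" and "7x to 70x" claims, not figures reported by the authors.

```python
# Values stated in the abstract.
spt_params = 212_000          # "only 212k parameters"
spt_gpu_hours_per_fold = 3    # "a single GPU in 3 hours for a fold of S3DIS"

# Implied upper bound on the size of the compared models ("up to 200 times more compact").
largest_compared_params = spt_params * 200

# Implied training-cost range of the compared methods ("7x to 70x fewer GPU-hours").
gpu_hours_low = spt_gpu_hours_per_fold * 7
gpu_hours_high = spt_gpu_hours_per_fold * 70

print(f"largest compared model: ~{largest_compared_params / 1e6:.1f}M parameters")
print(f"compared training cost: ~{gpu_hours_low} to {gpu_hours_high} GPU-hours per fold")
```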
ICCV 2023
Results from the Paper
Ranked #1 on 3D Semantic Segmentation on S3DIS (mIoU (6-Fold) metric)
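Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
---|---|---|---|---|---|
3D Semantic Segmentation | DALES | Superpoint Transformer | mIoU | 79.6 | #2 |
3D Semantic Segmentation | DALES | Superpoint Transformer | Overall Accuracy | 97.5 | #2 |
3D Semantic Segmentation | DALES | Superpoint Transformer | Model size | 212K | #1 |
3D Semantic Segmentation | KITTI-360 | Superpoint Transformer | mIoU (Val) | 63.5 | #1 |
3D Semantic Segmentation | KITTI-360 | Superpoint Transformer | Model size | 777K | #1 |
3D Semantic Segmentation | S3DIS | Superpoint Transformer | mIoU (6-Fold) | 76.0 | #1 |
3D Semantic Segmentation | S3DIS | Superpoint Transformer | mAcc | 85.8 | #1 |
Semantic Segmentation | S3DIS | Superpoint Transformer | mIoU | 76.0 | #1 |
Semantic Segmentation | S3DIS | Superpoint Transformer | Mean IoU | 76.0 | #9 |
Semantic Segmentation | S3DIS | Superpoint Transformer | mAcc | 85.8 | #5 |
Semantic Segmentation | S3DIS | Superpoint Transformer | oAcc | 90.4 | #9 |
Semantic Segmentation | S3DIS | Superpoint Transformer | Number of params | 0.212M | #36 |
Semantic Segmentation | S3DIS | Superpoint Transformer | Params (M) | 0.212 | #17 |
Semantic Segmentation | S3DIS Area5 | Superpoint Transformer | mIoU | 68.9 | #27 |
Semantic Segmentation | S3DIS Area5 | Superpoint Transformer | oAcc | 89.5 | #22 |
Semantic Segmentation | S3DIS Area5 | Superpoint Transformer | mAcc | 77.3 | #18 |
Semantic Segmentation | S3DIS Area5 | Superpoint Transformer | Number of params | 212K | #2 |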
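
The mIoU, mAcc, and oAcc columns are standard segmentation metrics derived from a per-class confusion matrix. The following is a minimal pure-Python sketch of those standard definitions, not the evaluation code from the paper's repository; the function names `confusion_matrix` and `segmentation_metrics` are hypothetical, and skipping classes that never appear is an assumption about how empty classes are handled.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Build cm where cm[t][p] counts points of true class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return (mIoU, mAcc, oAcc) in percent from a square confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[c][c] for c in range(n))
    ious, accs = [], []
    for c in range(n):
        tp = cm[c][c]
        gt = sum(cm[c])                         # points labelled as class c
        pred = sum(cm[r][c] for r in range(n))  # points predicted as class c
        union = gt + pred - tp
        if union > 0:   # assumption: skip classes absent from labels and predictions
            ious.append(tp / union)
        if gt > 0:      # per-class accuracy is only defined for classes that occur
            accs.append(tp / gt)
    miou = 100.0 * sum(ious) / len(ious) if ious else 0.0
    macc = 100.0 * sum(accs) / len(accs) if accs else 0.0
    oacc = 100.0 * correct / total if total else 0.0
    return miou, macc, oacc


# Toy example with three classes and six points.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
miou, macc, oacc = segmentation_metrics(confusion_matrix(y_true, y_pred, 3))
print(f"mIoU={miou:.1f}  mAcc={macc:.1f}  oAcc={oacc:.1f}")
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 50%. mIoU weights every class equally regardless of how many points it contains, which is why it can differ substantially from overall accuracy on imbalanced datasets.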