TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Robust 3D Semantic Segmentation	nuScenes-C	MinkUNet-34	mean Corruption Error (mCE)	96.37%	# 2
Robust 3D Semantic Segmentation	nuScenes-C	MinkUNet-18	mean Corruption Error (mCE)	100.00%	# 5
Semantic Segmentation	S3DIS	MinkowskiNet	Mean IoU	65.4	# 35
Semantic Segmentation	S3DIS	MinkowskiNet	Number of params	37.9M	# 50
Semantic Segmentation	S3DIS	MinkowskiNet	Params (M)	37.9	# 3
Semantic Segmentation	S3DIS Area5	MinkowskiNet	mIoU	65.4	# 36
Semantic Segmentation	S3DIS Area5	MinkowskiNet	mAcc	71.7	# 29
Semantic Segmentation	S3DIS Area5	MinkowskiNet	Number of params	37.9M	# 52
Semantic Segmentation	ScanNet	MinkowskiNet	test mIoU	73.4	# 15
Semantic Segmentation	ScanNet	MinkowskiNet	val mIoU	72.2	# 17
3D Semantic Segmentation	ScanNet++	MinkowskiNet	Top-1 IoU	0.292	# 2
3D Semantic Segmentation	ScanNet200	MinkUNet	val mIoU	25.0	# 9
3D Semantic Segmentation	ScanNet200	MinkUNet	test mIoU	25.3	# 7
3D Semantic Segmentation	ScribbleKITTI	MinkowskiNet	mIoU	55.0	# 3
Robust 3D Semantic Segmentation	SemanticKITTI-C	MinkUNet-34	mean Corruption Error (mCE)	100.61%	# 5
Robust 3D Semantic Segmentation	SemanticKITTI-C	MinkUNet-18	mean Corruption Error (mCE)	100.00%	# 3
Robust 3D Semantic Segmentation	WOD-C	MinkUNet-18	mean Corruption Error (mCE)	100.00%	# 3
Robust 3D Semantic Segmentation	WOD-C	MinkUNet-34	mean Corruption Error (mCE)	96.21%	# 1

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/4d-spatio-temporal-convnets-minkowski/robust-3d-semantic-segmentation-on-wod-c)](https://paperswithcode.com/sota/robust-3d-semantic-segmentation-on-wod-c?p=4d-spatio-temporal-convnets-minkowski)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/4d-spatio-temporal-convnets-minkowski/robust-3d-semantic-segmentation-on-nuscenes-c)](https://paperswithcode.com/sota/robust-3d-semantic-segmentation-on-nuscenes-c?p=4d-spatio-temporal-convnets-minkowski)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/4d-spatio-temporal-convnets-minkowski/3d-semantic-segmentation-on-scannet-1)](https://paperswithcode.com/sota/3d-semantic-segmentation-on-scannet-1?p=4d-spatio-temporal-convnets-minkowski)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/4d-spatio-temporal-convnets-minkowski/3d-semantic-segmentation-on-scribblekitti)](https://paperswithcode.com/sota/3d-semantic-segmentation-on-scribblekitti?p=4d-spatio-temporal-convnets-minkowski)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/4d-spatio-temporal-convnets-minkowski/robust-3d-semantic-segmentation-on)](https://paperswithcode.com/sota/robust-3d-semantic-segmentation-on?p=4d-spatio-temporal-convnets-minkowski)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/4d-spatio-temporal-convnets-minkowski/3d-semantic-segmentation-on-scannet200)](https://paperswithcode.com/sota/3d-semantic-segmentation-on-scannet200?p=4d-spatio-temporal-convnets-minkowski)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/4d-spatio-temporal-convnets-minkowski/semantic-segmentation-on-scannet)](https://paperswithcode.com/sota/semantic-segmentation-on-scannet?p=4d-spatio-temporal-convnets-minkowski)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/4d-spatio-temporal-convnets-minkowski/semantic-segmentation-on-s3dis)](https://paperswithcode.com/sota/semantic-segmentation-on-s3dis?p=4d-spatio-temporal-convnets-minkowski)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/4d-spatio-temporal-convnets-minkowski/semantic-segmentation-on-s3dis-area5)](https://paperswithcode.com/sota/semantic-segmentation-on-s3dis-area5?p=4d-spatio-temporal-convnets-minkowski)`

4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks

CVPR 2019 · Christopher Choy, JunYoung Gwak, Silvio Savarese

In many robotics and VR/AR applications, 3D-videos are readily-available sources of input (a continuous sequence of depth images, or LIDAR scans). However, those 3D-videos are processed frame-by-frame either through 2D convnets or 3D perception algorithms. In this work, we propose 4-dimensional convolutional neural networks for spatio-temporal perception that can directly process such 3D-videos using high-dimensional convolutions. For this, we adopt sparse tensors and propose the generalized sparse convolution that encompasses all discrete convolutions. To implement the generalized sparse convolution, we create an open-source auto-differentiation library for sparse tensors that provides extensive functions for high-dimensional convolutional neural networks. We create 4D spatio-temporal convolutional neural networks using the library and validate them on various 3D semantic segmentation benchmarks and proposed 4D datasets for 3D-video perception. To overcome challenges in the 4D space, we propose the hybrid kernel, a special case of the generalized sparse convolution, and the trilateral-stationary conditional random field that enforces spatio-temporal consistency in the 7D space-time-chroma space. Experimentally, we show that convolutional neural networks with only generalized 3D sparse convolutions can outperform 2D or 2D-3D hybrid methods by a large margin. Also, we show that on 3D-videos, 4D spatio-temporal convolutional neural networks are robust to noise, outperform 3D convolutional neural networks and are faster than the 3D counterpart in some cases.
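The generalized sparse convolution described in the abstract can be illustrated with a minimal pure-Python sketch. It assumes a sparse tensor stored as a dictionary from integer coordinates to feature vectors, and a kernel stored as a dictionary from coordinate offsets to weight matrices. Each output site sums the weighted features of whichever neighbors are actually occupied. The function name `sparse_conv`, the dictionary layout, and the toy data are hypothetical choices made for illustration; this is not the API of the authors' library and makes no attempt at its efficiency.

```python
from itertools import product


def sparse_conv(features, kernel, out_coords=None):
    """Sketch of a generalized sparse convolution on a dict-based sparse tensor.

    features:   {coordinate tuple: list of C_in floats}, occupied sites only
    kernel:     {offset tuple: C_out x C_in weight matrix as nested lists}
    out_coords: iterable of output coordinates (defaults to the input sites)
    """
    if out_coords is None:
        out_coords = features.keys()
    c_out = len(next(iter(kernel.values())))
    result = {}
    for u in out_coords:
        acc = [0.0] * c_out
        for offset, weight in kernel.items():
            neighbor = tuple(a + b for a, b in zip(u, offset))
            x = features.get(neighbor)
            if x is None:  # empty site: it contributes nothing to the sum
                continue
            for o in range(c_out):
                acc[o] += sum(w * v for w, v in zip(weight[o], x))
        result[u] = acc
    return result


# Toy 4D input (x, y, z, t): one feature channel at three occupied sites.
points = {
    (0, 0, 0, 0): [1.0],
    (1, 0, 0, 0): [2.0],
    (0, 0, 0, 1): [4.0],
}

# Hypercubic 3x3x3x3 kernel with all weights 1 (1 input, 1 output channel),
# so each output is the sum of the occupied neighbors' features.
box_kernel = {off: [[1.0]] for off in product((-1, 0, 1), repeat=4)}
summed = sparse_conv(points, box_kernel)

# Kernel offsets are arbitrary: this one only looks along the time axis.
time_kernel = {(0, 0, 0, dt): [[1.0]] for dt in (-1, 0, 1)}
temporal = sparse_conv(points, time_kernel)
```

In this toy input all three sites lie within one step of each other, so the hypercubic kernel gives `[7.0]` at every site, while the time-only kernel gives `[5.0]` at the two sites that share spatial coordinates and `[2.0]` at the remaining one. The second kernel only shows that the set of offsets is free to take any shape; it is not the paper's hybrid kernel.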


Code


StanfordVL/MinkowskiEngine (official)	2,293
NVIDIA/MinkowskiEngine	2,293
Pointcept/Pointcept	1,119
mit-han-lab/spvnas (Quickstart in Colab)	571
ldkong1205/Robo3D	272

See all 7 implementations

Tasks


3D Semantic Segmentation

4D Spatio Temporal Semantic Segmentation

Robust 3D Semantic Segmentation

Semantic Segmentation

Datasets

ScanNet

S3DIS

STPLS3D

nuScenes-C

ScanNet200

SemanticKITTI-C

ScribbleKITTI

ScanNet++

WOD-C

Results from the Paper


Ranked #1 on Robust 3D Semantic Segmentation on WOD-C



Methods


Convolution • Sparse Convolutions • Submanifold Convolution
