ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation

CVPR 2021 · Siyuan Qiao, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

In this paper, we present ViP-DeepLab, a unified model attempting to tackle the long-standing and challenging inverse projection problem in vision, which we model as restoring the point clouds from perspective image sequences while providing each point with instance-level semantic interpretations. Solving this problem requires the vision models to predict the spatial location, semantic class, and temporally consistent instance label for each 3D point. ViP-DeepLab approaches it by jointly performing monocular depth estimation and video panoptic segmentation. We name this joint task Depth-aware Video Panoptic Segmentation (DVPS), and propose a new evaluation metric along with two derived datasets for it, which will be made available to the public. On the individual sub-tasks, ViP-DeepLab also achieves state-of-the-art results, outperforming previous methods by 5.1% VPQ on Cityscapes-VPS, ranking 1st on the KITTI monocular depth estimation benchmark, and 1st on the KITTI MOTS pedestrian benchmark. The datasets and the evaluation code are publicly available.
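To give a concrete sense of how a depth-aware panoptic metric such as DVPQ can extend ordinary (video) panoptic quality, the sketch below illustrates one plausible depth-filtering step: pixels whose predicted depth deviates from the ground truth by more than a relative threshold are voided before panoptic matching is run on the remaining pixels. This is a minimal illustration under our own assumptions; the function name, void label, and threshold value are hypothetical and not the authors' released evaluation code.

```python
import numpy as np

def depth_filtered_panoptic(panoptic_pred, depth_pred, depth_gt, rel_threshold=0.25):
    """Sketch of a depth-aware filtering step for a DVPQ-style metric.

    Pixels whose absolute relative depth error exceeds `rel_threshold`
    are set to a void label, so only depth-consistent pixels take part
    in the subsequent (video) panoptic quality matching.
    All names and the void convention here are illustrative assumptions.
    """
    VOID = -1  # assumed void label used by the downstream PQ/VPQ matcher
    abs_rel_error = np.abs(depth_pred - depth_gt) / np.maximum(depth_gt, 1e-6)
    filtered = panoptic_pred.copy()
    filtered[abs_rel_error > rel_threshold] = VOID
    return filtered
```

In this reading, a stricter threshold rewards models whose depth and segmentation are jointly accurate, which matches the paper's framing of DVPS as a single task rather than two independent ones.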


Results from the Paper


 Ranked #1 on Video Panoptic Segmentation on Cityscapes-VPS (using extra training data)

| Task | Dataset | Model | Metric | Value | Global Rank | Uses Extra Training Data |
|---|---|---|---|---|---|---|
| Depth-aware Video Panoptic Segmentation | Cityscapes-DVPS | ViP-DeepLab | DVPQ | 55.1 | #3 | |
| Video Panoptic Segmentation | Cityscapes-VPS | ViP-DeepLab | VPQ | 63.1 | #1 | Yes |
| Video Panoptic Segmentation | Cityscapes-VPS | ViP-DeepLab | VPQ (thing) | 49.5 | #2 | Yes |
| Video Panoptic Segmentation | Cityscapes-VPS | ViP-DeepLab | VPQ (stuff) | 73.0 | #1 | Yes |
| Depth-aware Video Panoptic Segmentation | SemKITTI-DVPS | ViP-DeepLab | DVPQ | 45.6 | #2 | |
