TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Semantic Segmentation	Cityscapes test	OCNet	Mean IoU (class)	81.7%	# 35
Semantic Segmentation	Trans10K	OCNet	mIoU	66.31%	# 9
Semantic Segmentation	Trans10K	OCNet	GFLOPs	43.31	# 9

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/ocnet-object-context-network-for-scene/semantic-segmentation-on-trans10k)](https://paperswithcode.com/sota/semantic-segmentation-on-trans10k?p=ocnet-object-context-network-for-scene)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/ocnet-object-context-network-for-scene/semantic-segmentation-on-cityscapes)](https://paperswithcode.com/sota/semantic-segmentation-on-cityscapes?p=ocnet-object-context-network-for-scene)`

OCNet: Object Context Network for Scene Parsing

4 Sep 2018 · Yuhui Yuan, Lang Huang, Jianyuan Guo, Chao Zhang, Xilin Chen, Jingdong Wang ·

In this paper, we address the semantic segmentation task with a new context aggregation scheme named \emph{object context}, which focuses on enhancing the role of object information. Motivated by the fact that the category of each pixel is inherited from the object it belongs to, we define the object context for each pixel as the set of pixels that belong to the same category as the given pixel in the image. We use a binary relation matrix to represent the relationship between all pixels, where the value one indicates the two selected pixels belong to the same category and zero otherwise. We propose to use a dense relation matrix to serve as a surrogate for the binary relation matrix. The dense relation matrix is capable to emphasize the contribution of object information as the relation scores tend to be larger on the object pixels than the other pixels. Considering that the dense relation matrix estimation requires quadratic computation overhead and memory consumption w.r.t. the input size, we propose an efficient interlaced sparse self-attention scheme to model the dense relations between any two of all pixels via the combination of two sparse relation matrices. To capture richer context information, we further combine our interlaced sparse self-attention scheme with the conventional multi-scale context schemes including pyramid pooling~\citep{zhao2017pyramid} and atrous spatial pyramid pooling~\citep{chen2018deeplab}. We empirically show the advantages of our approach with competitive performances on five challenging benchmarks including: Cityscapes, ADE20K, LIP, PASCAL-Context and COCO-Stuff

PDF Abstract

Code

Add Remove Mark official

PkuRainBow/OCNet official

813

PkuRainBow/OCNet.pytorch official

813

open-mmlab/mmsegmentation

↳ Quickstart in

Colab

7,387

openseg-group/openseg.pytorch

1,173

openseg-group/OCNet.pytorch

813

See all 8 implementations

Tasks

Add Remove

Object

Relation

Scene Parsing

Semantic Segmentation

Datasets

Cityscapes

ADE20K

LIP Trans10K

Results from the Paper

Edit

Ranked #9 on Semantic Segmentation on Trans10K

Get a GitHub badge

Task	Dataset	Model	Metric Name	Metric Value	Global Rank	Benchmark
Semantic Segmentation	Cityscapes test	OCNet	Mean IoU (class)	81.7%	# 35	Compare
Semantic Segmentation	Trans10K	OCNet	mIoU	66.31%	# 9	Compare
Semantic Segmentation	Trans10K	OCNet	GFLOPs	43.31	# 9	Compare

Methods

Add Remove

Auxiliary Classifier • Average Pooling • Batch Normalization • Convolution • Dilated Convolution • PSPNet • Pyramid Pooling Module • ReLU

Edit Social Preview

OCNet: Object Context Network for Scene Parsing

Code Edit Add Remove Mark official

Tasks Edit Add Remove

Datasets Edit

Results from the Paper Edit

Methods Edit Add Remove

Code

Add Remove Mark official

Tasks

Add Remove

Datasets

Results from the Paper

Edit

Methods

Add Remove