Panoptic Scene Graph Generation

22 Jul 2022  ·  Jingkang Yang, Yi Zhe Ang, Zujin Guo, Kaiyang Zhou, Wayne Zhang, Ziwei Liu

Existing research addresses scene graph generation (SGG) -- a critical technology for scene understanding in images -- from a detection perspective: objects are detected with bounding boxes, and their pairwise relationships are then predicted. We argue that this paradigm causes several problems that impede progress in the field. For instance, bounding box-based labels in current datasets usually contain redundant classes like hair, and leave out background information that is crucial to understanding context. In this work, we introduce panoptic scene graph generation (PSG), a new task that requires the model to generate a more comprehensive scene graph representation based on panoptic segmentations rather than rigid bounding boxes. A high-quality PSG dataset, containing 49k well-annotated images drawn from the overlap of COCO and Visual Genome, is created for the community to track progress. For benchmarking, we build four two-stage baselines adapted from classic SGG methods, and two one-stage baselines, PSGTR and PSGFormer, both based on the efficient Transformer-based detector DETR. While PSGTR uses a set of queries to directly learn triplets, PSGFormer models objects and relations separately as queries from two Transformer decoders, followed by a prompting-like relation-object matching mechanism. Finally, we share insights on open challenges and future directions.
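
To make the task format concrete, the sketch below shows one plausible in-memory representation of a panoptic scene graph: nodes are pixel-accurate segments rather than boxes, so background ("stuff") regions like grass or sky can participate in relations. The class and method names here are illustrative assumptions, not the paper's released code.

```python
import numpy as np

# A minimal sketch of a panoptic scene graph as an in-memory structure.
# Names are illustrative assumptions, not the authors' released API.

class PanopticSceneGraph:
    """Nodes are panoptic segments -- a class label plus a pixel mask,
    covering background 'stuff' (grass, sky) as well as 'thing' objects;
    edges are (subject, predicate, object) triplets over segment ids."""

    def __init__(self, height, width):
        self.shape = (height, width)
        self.segments = []   # list of (class_name, HxW boolean mask)
        self.relations = []  # list of (subject_id, predicate, object_id)

    def add_segment(self, class_name, mask):
        # Panoptic masks are mutually exclusive pixel regions.
        assert mask.shape == self.shape and mask.dtype == np.bool_
        self.segments.append((class_name, mask))
        return len(self.segments) - 1  # new segment id

    def add_relation(self, subject_id, predicate, object_id):
        self.relations.append((subject_id, predicate, object_id))


# Toy 4x4 scene: a person standing on grass. The grass segment is a
# background region that a box-based scene graph could not represent.
person_mask = np.zeros((4, 4), dtype=bool)
person_mask[:2, 1:3] = True

g = PanopticSceneGraph(4, 4)
person = g.add_segment("person", person_mask)
grass = g.add_segment("grass", ~person_mask)
g.add_relation(person, "standing-on", grass)
print(g.relations)  # [(0, 'standing-on', 1)]
```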

Datasets

Introduced in the Paper: PSG Dataset

Used in the Paper: MS COCO, Visual Genome, GQA

Task                             Dataset      Model      Metric  Value  Global Rank
Panoptic Scene Graph Generation  PSG Dataset  PSGTR      R@20    28.4   #4
Panoptic Scene Graph Generation  PSG Dataset  PSGTR      mR@20   16.6   #5
Panoptic Scene Graph Generation  PSG Dataset  PSGFormer  R@20    18.0   #8
Panoptic Scene Graph Generation  PSG Dataset  PSGFormer  mR@20   14.8   #6
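
For reference on the metrics above: R@K measures the fraction of ground-truth triplets recovered among a model's top-K predictions, while mR@K averages that recall per predicate class, so rare predicates weigh as much as frequent ones. Below is a minimal sketch of both, computed per image for brevity; the real PSG benchmark additionally requires sufficient mask overlap (IoU above 0.5) between predicted and ground-truth segments, whereas this sketch matches triplets on labels alone, and all names are illustrative.

```python
# Minimal sketch of Recall@K (R@K) and mean Recall@K (mR@K) for scene
# graph triplets. Simplifying assumption: triplets match on labels
# alone; the actual benchmark also checks segment mask IoU.

def recall_at_k(gt_triplets, ranked_preds, k=20):
    """Fraction of ground-truth (subject, predicate, object) triplets
    found among the top-k predictions (ranked by confidence)."""
    hits = gt_triplets & set(ranked_preds[:k])
    return len(hits) / max(len(gt_triplets), 1)

def mean_recall_at_k(gt_triplets, ranked_preds, k=20):
    """Recall averaged per predicate class, so rare predicates count
    as much as frequent ones (the long-tail-sensitive metric)."""
    predicates = {p for _, p, _ in gt_triplets}
    per_predicate = [
        recall_at_k({t for t in gt_triplets if t[1] == p}, ranked_preds, k)
        for p in predicates
    ]
    return sum(per_predicate) / max(len(per_predicate), 1)

# Toy example: two ground-truth triplets, one recovered in the top-k.
gt = {("person", "standing-on", "grass"), ("person", "holding", "cup")}
preds = [("person", "standing-on", "grass"), ("dog", "sitting-on", "grass")]
print(recall_at_k(gt, preds))       # 0.5
print(mean_recall_at_k(gt, preds))  # 0.5
```

The gap between R@20 and mR@20 in the leaderboard above (e.g., 28.4 vs. 16.6 for PSGTR) is typical of long-tailed predicate distributions: frequent predicates dominate R@K, while mR@K exposes weaker performance on rare ones.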
