Scene Graph Generation

110 papers with code • 5 benchmarks • 7 datasets

A scene graph is a structured representation of an image in which nodes correspond to object bounding boxes with their object categories, and edges correspond to pairwise relationships between objects. The task of Scene Graph Generation is to generate a visually grounded scene graph that most accurately correlates with an image.

Source: Scene Graph Generation by Iterative Message Passing
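To make the definition above concrete, the following is a minimal sketch of a scene graph as a data structure. The class names (`ObjectNode`, `RelationEdge`, `SceneGraph`), the box format `(x_min, y_min, x_max, y_max)`, and the example objects are illustrative assumptions, not taken from the source paper or from any listed implementation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical box format, assumed for illustration: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class ObjectNode:
    """A node: one object bounding box with its object category."""
    category: str
    box: Box


@dataclass(frozen=True)
class RelationEdge:
    """A directed edge: a pairwise relationship between two nodes."""
    subject: int    # index of the subject node
    predicate: str  # relationship label, e.g. "riding"
    object: int     # index of the object node


@dataclass
class SceneGraph:
    nodes: List[ObjectNode] = field(default_factory=list)
    edges: List[RelationEdge] = field(default_factory=list)

    def add_object(self, category: str, box: Box) -> int:
        """Add a node and return its index."""
        self.nodes.append(ObjectNode(category, box))
        return len(self.nodes) - 1

    def add_relation(self, subject: int, predicate: str, obj: int) -> None:
        """Add an edge between two existing nodes."""
        n = len(self.nodes)
        if not (0 <= subject < n and 0 <= obj < n):
            raise IndexError("relation refers to a node that does not exist")
        self.edges.append(RelationEdge(subject, predicate, obj))

    def triplets(self) -> List[Tuple[str, str, str]]:
        """Return (subject category, predicate, object category) per edge."""
        return [
            (self.nodes[e.subject].category, e.predicate, self.nodes[e.object].category)
            for e in self.edges
        ]


# Example image content, invented for illustration: a man riding a horse.
graph = SceneGraph()
man = graph.add_object("man", (10, 20, 110, 220))
horse = graph.add_object("horse", (60, 80, 300, 260))
graph.add_relation(man, "riding", horse)
print(graph.triplets())  # [('man', 'riding', 'horse')]
```

Under this representation, a Scene Graph Generation model would take an image as input and predict both the node list (boxes and categories) and the edge list (predicates).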

Adaptive Visual Scene Understanding: Incremental Scene Graph Generation

zhanglab-deepneurocoglab/csegg 2 Oct 2023

To address the lack of continual learning methodologies in SGG, we introduce the comprehensive Continual ScenE Graph Generation (CSEGG) dataset along with 3 learning scenarios and 8 evaluation metrics.

4

Less is More: Toward Zero-Shot Local Scene Graph Generation via Foundation Models

bowen-upenn/Multi-Agent-VQA 2 Oct 2023

To fill this gap, we present a new task called Local Scene Graph Generation.

2

Spatial-Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation

hcplab-sysu/stket 23 Sep 2023

In this work, we propose a spatial-temporal knowledge-embedded transformer (STKET) that incorporates the prior spatial-temporal knowledge into the multi-head cross-attention mechanism to learn more representative relationship representations.

7

Zero-Shot Scene Graph Generation via Triplet Calibration and Reduction

jkli1998/t-car 7 Sep 2023

In our framework, a triplet calibration loss is first presented to regularize the representations of diverse triplets and to simultaneously excavate the unseen triplets in incompletely annotated training scene graphs.

7

Towards Addressing the Misalignment of Object Proposal Evaluation for Vision-Language Tasks via Semantic Grounding

joshuafeinglass/vl-detector-eval 1 Sep 2023

Object proposal generation serves as a standard pre-processing step in Vision-Language (VL) tasks (image captioning, visual question answering, etc.).

4

Head-Tail Cooperative Learning Network for Unbiased Scene Graph Generation

wanglei0618/htcl 23 Aug 2023

We also propose a self-supervised learning approach to enhance the prediction ability of the tail-prefer feature representation branch by constraining tail-prefer predicate features.

2

RLIPv2: Fast Scaling of Relational Language-Image Pre-training

jacobyuan7/rlipv2 ICCV 2023

In this paper, we propose RLIPv2, a fast converging model that enables the scaling of relational pre-training to large-scale pseudo-labelled scene graph data.

93
18 Aug 2023

Vision Relation Transformer for Unbiased Scene Graph Generation

visinf/veto ICCV 2023

Recent years have seen a growing interest in Scene Graph Generation (SGG), a comprehensive visual scene understanding task that aims to predict entity relationships using a relation encoder-decoder pipeline stacked on top of an object encoder-decoder backbone.

19
18 Aug 2023

Compositional Feature Augmentation for Unbiased Scene Graph Generation

hkust-longgroup/cfa ICCV 2023

Specifically, we first decompose each relation triplet feature into two components: intrinsic feature and extrinsic feature, which correspond to the intrinsic characteristics and extrinsic contexts of a relation triplet, respectively.

12
13 Aug 2023

Environment-Invariant Curriculum Relation Learning for Fine-Grained Scene Graph Generation

myukzzz/eicr ICCV 2023

Then, we construct a class-balanced curriculum learning strategy to balance the different environments to remove the predicate imbalance.

5
07 Aug 2023