Zero-Shot Object Detection

26 papers with code • 7 benchmarks • 6 datasets

Zero-shot object detection (ZSD) is the task of object detection where no visual training data is available for some of the target object classes.
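One common way to realize the definition above is to score each candidate region against semantic embeddings of class names, so that a class with no visual training data can still be predicted. The sketch below is a minimal illustration of that idea, not the method of any paper listed on this page. The function names, the threshold value, and the toy vectors are all hypothetical; in practice the region embeddings would come from a detector trained on the seen classes and the class embeddings from a word or text encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify_regions(region_embeddings, class_embeddings, threshold=0.5):
    """Label each region with its best-matching class, or None for background.

    The threshold is a hypothetical cut-off: regions whose best similarity
    does not exceed it are treated as background.
    """
    labels = []
    for region in region_embeddings:
        best_name, best_score = None, threshold
        for name, embedding in class_embeddings.items():
            score = cosine(region, embedding)
            if score > best_score:
                best_name, best_score = name, score
        labels.append(best_name)
    return labels


# Toy semantic embeddings for two classes assumed unseen during training.
class_embeddings = {"zebra": [0.9, 0.1, 0.0], "bus": [0.0, 0.2, 0.9]}

# Toy region embeddings, assumed to be already projected into the same space.
region_embeddings = [
    [0.8, 0.2, 0.1],
    [0.1, 0.1, 0.95],
    [-0.5, 0.9, -0.3],
]

labels = classify_regions(region_embeddings, class_embeddings)
print(labels)
```

Because classification depends only on the class embeddings, adding a new target class in this sketch means adding one entry to `class_embeddings`, with no new box annotations.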

(Image credit: Zero-Shot Object Detection: Learning to Simultaneously Recognize and Localize Novel Concepts)

Benchmarks

These leaderboards track progress in Zero-Shot Object Detection.

Dataset	Best Model
MS-COCO	SeeDS
PASCAL VOC'07	SeeDS
LVIS v1.0 minival	OWLv2 (OWL-ST+FT)
LVIS v1.0 val	OWLv2 (OWL-ST+FT)
ODinW	Grounding DINO
MSCOCO	Grounding DINO (without COCO data)
ImageNet Detection	SUZOD

Libraries

Use these libraries to find Zero-Shot Object Detection models and implementations.

microsoft/GLIP

2 papers

1,957

Datasets

Latest papers

T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy

idea-research/t-rex • 21 Mar 2024

Recognizing the complementary strengths and weaknesses of both text and visual prompts, we introduce T-Rex2 that synergizes both prompts within a single model through contrastive learning.

1,855

SeeDS: Semantic Separable Diffusion Synthesizer for Zero-shot Food Detection

lancezpf/seeds • 7 Oct 2023

To tackle this, we propose the Semantic Separable Diffusion Synthesizer (SeeDS) framework for Zero-Shot Food Detection (ZSFD).

ViLLA: Fine-Grained Vision-Language Representation Learning from Real-World Data

stanfordmimi/villa • ICCV 2023

The first key contribution of this work is to demonstrate through systematic evaluations that as the pairwise complexity of the training dataset increases, standard VLMs struggle to learn region-attribute relationships, exhibiting performance degradations of up to 37% on retrieval tasks.

22 Aug 2023

Scaling Open-Vocabulary Object Detection

google-research/scenic • NeurIPS 2023

However, with OWL-ST, we can scale to over 1B examples, yielding further large improvement: With an L/14 architecture, OWL-ST improves AP on LVIS rare classes, for which the model has seen no human box annotations, from 31.2% to 44.6% (43% relative improvement).
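The relative improvement quoted in the excerpt above can be checked with a short calculation. The helper name `relative_improvement` is hypothetical and used only for this illustration; the two AP values are the ones quoted in the excerpt.

```python
def relative_improvement(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a percentage of `before`."""
    return (after - before) / before * 100.0


# AP on LVIS rare classes, as quoted in the excerpt above.
gain = relative_improvement(31.2, 44.6)
print(f"{gain:.1f}%")  # prints "42.9%", which rounds to the quoted 43%
```

The absolute gain is 13.4 AP points; dividing by the 31.2 baseline gives the relative figure.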

2,994

16 Jun 2023

Multi-modal Queried Object Detection in the Wild

yifanxu74/mq-det • NeurIPS 2023

To address the learning inertia problem brought by the frozen detector, a vision conditioned masked language prediction strategy is proposed.

226

30 May 2023

DoUnseen: Tuning-Free Class-Adaptive Object Detection of Unseen Objects for Robotic Grasping

AnasIbrahim/image_agnostic_segmentation • 6 Apr 2023

In this work, we are interested in open sets where the number of classes is unknown, varying, and without pre-knowledge about the objects' types.

ZBS: Zero-shot Background Subtraction via Instance-level Background Modeling and Foreground Selection

casia-iva-lab/zbs • CVPR 2023

However, previous unsupervised deep learning BGS algorithms perform poorly in sophisticated scenarios such as shadows or night lights, and they cannot detect objects outside the pre-defined categories.

26 Mar 2023

Efficient Feature Distillation for Zero-shot Annotation Object Detection

dragonlzm/EZSD • 21 Mar 2023

We propose a new setting for detecting unseen objects called Zero-shot Annotation object Detection (ZAD).

Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection

huggingface/transformers • 9 Mar 2023

To effectively fuse language and vision modalities, we conceptually divide a closed-set detector into three phases and propose a tight fusion solution, which includes a feature enhancer, a language-guided query selection, and a cross-modality decoder for cross-modality fusion.

124,984

Resolving Semantic Confusions for Improved Zero-Shot Detection

sandipan211/ZSD-SC-Resolver • British Machine Vision Conference 2022

Zero-shot detection (ZSD) is a challenging task where we aim to recognize and localize objects simultaneously, even when our model has not been trained with visual samples of a few target ("unseen") classes.

12 Dec 2022
