ScaleDet: A Scalable Multi-Dataset Object Detector

Multi-dataset training provides a viable solution for exploiting heterogeneous large-scale datasets without extra annotation cost. In this work, we propose a scalable multi-dataset detector (ScaleDet) that scales up its generalization across datasets as the number of training datasets increases. Unlike existing multi-dataset learners that mostly rely on manual relabelling efforts or sophisticated optimizations to unify labels across datasets, we introduce a simple yet scalable formulation to derive a unified semantic label space for multi-dataset training. ScaleDet is trained with visual-textual alignment to learn label assignments using label semantic similarities across datasets. Once trained, ScaleDet generalizes well to any given upstream and downstream datasets with seen and unseen classes. We conduct extensive experiments using LVIS, COCO, Objects365, and OpenImages as upstream datasets, and 13 datasets from Object Detection in the Wild (ODinW) as downstream datasets. Our results show that ScaleDet achieves strong performance with an mAP of 50.7 on LVIS, 58.8 on COCO, 46.8 on Objects365, 76.2 on OpenImages, and 71.8 on ODinW, surpassing state-of-the-art detectors with the same backbone.
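The core idea described above — embedding every dataset's class names with a text encoder, concatenating them into one unified label space, and aligning region features to those text embeddings with both hard labels and soft label semantic similarities — can be illustrated with a short sketch. This is a minimal illustration, not the authors' code: the `encode_text` placeholder, the temperature `tau`, and the loss weighting `alpha` are assumptions, and the exact losses and label-space construction in the paper may differ.

```python
# Minimal sketch of a unified semantic label space and visual-textual alignment.
import torch
import torch.nn.functional as F

def encode_text(class_names, dim=512):
    # Placeholder for a frozen text encoder (e.g. a CLIP-style model);
    # returns random L2-normalized embeddings here for illustration only.
    return F.normalize(torch.randn(len(class_names), dim), dim=-1)

# 1) Unified label space: concatenate class-name embeddings from all datasets.
datasets = {
    "coco":       ["person", "car", "dog"],
    "objects365": ["person", "truck", "cat"],
}
text_embs = torch.cat([encode_text(names) for names in datasets.values()])  # (C, D)

# 2) Label semantic similarities across datasets, used as soft targets.
sem_sim = text_embs @ text_embs.t()                                         # (C, C)

def alignment_loss(region_feats, gt_labels, tau=0.01, alpha=0.5):
    """Hard label assignment plus soft semantic-similarity alignment."""
    region_feats = F.normalize(region_feats, dim=-1)
    logits = region_feats @ text_embs.t() / tau                              # (N, C)
    hard = F.cross_entropy(logits, gt_labels)
    soft_targets = F.softmax(sem_sim[gt_labels] / tau, dim=-1)
    soft = F.kl_div(F.log_softmax(logits, dim=-1), soft_targets,
                    reduction="batchmean")
    return alpha * hard + (1 - alpha) * soft

# Example: four region features labeled in the unified (6-class) space.
feats = torch.randn(4, 512)
labels = torch.tensor([0, 2, 3, 5])
print(alignment_loss(feats, labels))
```

Because the classifier is just a similarity against text embeddings, adding another training dataset only extends the unified label space with more class-name embeddings rather than requiring any manual label mapping.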

CVPR 2023

Results from the Paper


 Ranked #1 on Object Detection on OpenImages-v6 (using extra training data)
Task               Dataset        Model     Metric Name   Metric Value   Global Rank
Object Detection   LVIS v1.0      ScaleDet  box AP        50.7           #1
Object Detection   MSCOCO         ScaleDet  AP            58.8           #1
Object Detection   Objects365     ScaleDet  AP            46.8           #1
Object Detection   OpenImages-v6  ScaleDet  box AP        76.2           #1
