1 code implementation • 27 Mar 2024 • Shweta Singh, Aayan Yadav, Jitesh Jain, Humphrey Shi, Justin Johnson, Karan Desai
With these findings, we advocate using COCO-ReM for future object detection research.
1 code implementation • 18 Apr 2023 • Karan Desai, Maximilian Nickel, Tanmay Rajpurohit, Justin Johnson, Ramakrishna Vedantam
Visual and linguistic concepts naturally organize themselves in a hierarchy, where a textual concept "dog" entails all images that contain dogs.
1 code implementation • CVPR 2023 • Mohamed El Banani, Karan Desai, Justin Johnson
Our approach diverges from image-based contrastive learning by sampling view pairs using language similarity instead of hand-crafted augmentations or learned clusters.
1 code implementation • 22 Nov 2021 • Karan Desai, Gaurav Kaul, Zubin Aysola, Justin Johnson
We introduce RedCaps -- a large-scale dataset of 12M image-text pairs collected from Reddit.
no code implementations • CVPR 2021 • Ramprasaath R. Selvaraju, Karan Desai, Justin Johnson, Nikhil Naik
Recent advances in self-supervised learning (SSL) have largely closed the gap with supervised ImageNet pretraining.
3 code implementations • CVPR 2021 • Karan Desai, Justin Johnson
The de facto approach to many vision tasks is to start from pretrained visual representations, typically learned via supervised training on ImageNet.
1 code implementation • 24 May 2019 • Vincenzo Lomonaco, Karan Desai, Eugenio Culurciello, Davide Maltoni
High-dimensional always-changing environments constitute a hard challenge for current reinforcement learning techniques.
no code implementations • ICLR 2019 • Ramakrishna Vedantam, Karan Desai, Stefan Lee, Marcus Rohrbach, Dhruv Batra, Devi Parikh
We propose a new class of probabilistic neural-symbolic models that have symbolic functional programs as a latent, stochastic variable.
2 code implementations • ICCV 2019 • Harsh Agrawal, Karan Desai, YuFei Wang, Xinlei Chen, Rishabh Jain, Mark Johnson, Dhruv Batra, Devi Parikh, Stefan Lee, Peter Anderson
To encourage the development of image captioning models that can learn visual concepts from alternative data sources, such as object detection datasets, we present the first large-scale benchmark for this task.