1 code implementation • 10 May 2024 • Jean Mercat, Igor Vasiljevic, Sedrick Keh, Kushal Arora, Achal Dave, Adrien Gaidon, Thomas Kollar
Linear transformers have emerged as a subquadratic-time alternative to softmax attention and have garnered significant interest due to their fixed-size recurrent state that lowers inference cost.
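The fixed-size recurrent state mentioned above is the key property of linear attention: instead of attending over all past tokens, each step folds the key/value pair into a constant-size state. A minimal sketch of this idea (illustrative only, not this paper's exact formulation; the feature map `phi` is an assumption, a common `elu(x)+1` choice from the linear-attention literature):

```python
import numpy as np

def phi(x):
    # elu(x) + 1: a positive feature map often used in linear attention
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention_recurrent(Q, K, V):
    """Causal linear attention with O(1) state per step.

    Q, K, V: arrays of shape (seq_len, d). The state S (d x d) and
    normalizer z (d,) stay fixed-size regardless of sequence length,
    which is what lowers inference cost versus softmax attention.
    """
    d = Q.shape[1]
    S = np.zeros((d, d))  # running sum of outer products phi(k) v^T
    z = np.zeros(d)       # running sum of phi(k) for normalization
    outputs = []
    for q, k, v in zip(Q, K, V):
        fk = phi(k)
        S += np.outer(fk, v)
        z += fk
        fq = phi(q)
        outputs.append(fq @ S / (fq @ z + 1e-6))
    return np.stack(outputs)

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(8, 4)) for _ in range(3))
out = linear_attention_recurrent(Q, K, V)
print(out.shape)  # (8, 4)
```

At the first step the state contains only the first key/value pair, so the output is (approximately) the first value vector; later steps blend all past values weighted by feature-map similarity.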
1 code implementation • 13 Mar 2024 • Samir Yitzhak Gadre, Georgios Smyrnis, Vaishaal Shankar, Suchin Gururangan, Mitchell Wortsman, Rulin Shao, Jean Mercat, Alex Fang, Jeffrey Li, Sedrick Keh, Rui Xin, Marianna Nezhurina, Igor Vasiljevic, Jenia Jitsev, Alexandros G. Dimakis, Gabriel Ilharco, Shuran Song, Thomas Kollar, Yair Carmon, Achal Dave, Reinhard Heckel, Niklas Muennighoff, Ludwig Schmidt
We fit scaling laws that extrapolate in both the number of model parameters and the ratio of training tokens to parameters.
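Fitting a scaling law of the kind described above amounts to regressing observed losses against model size (and, in the paper, the tokens-per-parameter ratio) and extrapolating the fitted curve. A hedged sketch in one variable, using an assumed pure power-law form `loss = a * N**(-b)` rather than the paper's exact parameterization:

```python
import numpy as np

def fit_power_law(params, losses):
    # log(loss) = log(a) - b * log(N): ordinary least squares in log space
    logN, logL = np.log(params), np.log(losses)
    slope, log_a = np.polyfit(logN, logL, 1)
    return np.exp(log_a), -slope

# Synthetic small-scale measurements (ground truth a=10, b=0.3),
# standing in for losses measured at a fixed tokens-per-parameter ratio
params = np.array([1e7, 3e7, 1e8, 3e8])
losses = 10.0 * params ** -0.3
a, b = fit_power_law(params, losses)
predicted_1b = a * 1e9 ** -b  # extrapolate the fit to a 1B-parameter model
print(round(b, 3))  # 0.3
```

The extrapolation step is exactly where scaling laws earn their keep: small, cheap runs determine `a` and `b`, and the fitted curve predicts the loss of a model too large to sweep directly.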
1 code implementation • 25 Jan 2024 • Ege Ozguroglu, Ruoshi Liu, Dídac Surís, Dian Chen, Achal Dave, Pavel Tokmakov, Carl Vondrick
We introduce pix2gestalt, a framework for zero-shot amodal segmentation, which learns to estimate the shape and appearance of whole objects that are only partially visible behind occlusions.
no code implementations • 19 Jan 2024 • Matthew Kowal, Achal Dave, Rares Ambrus, Adrien Gaidon, Konstantinos G. Derpanis, Pavel Tokmakov
Concretely, we seek to explain the decision-making process of video transformers based on high-level, spatiotemporal concepts that are automatically discovered.
1 code implementation • 19 Dec 2023 • Cheng-Yen Hsieh, Kaihua Chen, Achal Dave, Tarasha Khurana, Deva Ramanan
Amodal perception, the ability to comprehend complete object structures from partial visibility, is a fundamental skill, even for infants.
no code implementations • 10 Oct 2023 • Wen-Hsuan Chu, Adam W. Harley, Pavel Tokmakov, Achal Dave, Leonidas Guibas, Katerina Fragkiadaki
This begs the question: can we re-purpose these large-scale pre-trained static image models for open-vocabulary video tracking?
no code implementations • 14 Apr 2023 • Rohan Sarkar, Achal Dave, Gerard Medioni, Benjamin Biggs
This paper presents Shape of You (SoY), an approach to improve the accuracy of 3D body shape estimation for vision-based clothing recommendation systems.
Ranked #4 on 3D Human Shape Estimation on SSP-3D
no code implementations • CVPR 2023 • Austin Xu, Mariya I. Vasileva, Achal Dave, Arjun Seshadri
Recent work leverages the expressive power of generative adversarial networks (GANs) to generate labeled synthetic datasets.
1 code implementation • 16 Nov 2022 • Neehar Peri, Achal Dave, Deva Ramanan, Shu Kong
Moreover, semantic classes are often organized within a hierarchy, e.g., tail classes such as child and construction-worker are arguably subclasses of pedestrian.
1 code implementation • 4 Oct 2022 • Tarasha Khurana, Peiyun Hu, Achal Dave, Jason Ziglar, David Held, Deva Ramanan
Self-supervised representations proposed for large-scale planning, such as ego-centric freespace, confound these two motions, making the representation difficult to use for downstream motion planners.
1 code implementation • 25 Sep 2022 • Ali Athar, Jonathon Luiten, Paul Voigtlaender, Tarasha Khurana, Achal Dave, Bastian Leibe, Deva Ramanan
Multiple existing benchmarks involve tracking and segmenting objects in video, e.g., Video Object Segmentation (VOS) and Multi-Object Tracking and Segmentation (MOTS), but there is little interaction between them due to the use of disparate benchmark datasets and metrics (e.g., J&F, mAP, sMOTSA).
Ranked #4 on Long-tail Video Object Segmentation on BURST-val (using extra training data)
2 code implementations • 3 May 2022 • Alex Fang, Gabriel Ilharco, Mitchell Wortsman, Yuhao Wan, Vaishaal Shankar, Achal Dave, Ludwig Schmidt
Contrastively trained language-image models such as CLIP, ALIGN, and BASIC have demonstrated unprecedented robustness to multiple challenging natural distribution shifts.
Ranked #94 on Image Classification on ObjectNet (using extra training data)
no code implementations • CVPR 2022 • Yang Liu, Idil Esen Zulfikar, Jonathon Luiten, Achal Dave, Deva Ramanan, Bastian Leibe, Aljoša Ošep, Laura Leal-Taixé
A benchmark that would allow us to perform an apples-to-apples comparison of existing efforts is a crucial first step towards advancing this important research field.
Ranked #3 on Open-World Video Segmentation on BURST-val (using extra training data)
no code implementations • 22 Apr 2021 • Yang Liu, Idil Esen Zulfikar, Jonathon Luiten, Achal Dave, Deva Ramanan, Bastian Leibe, Aljoša Ošep, Laura Leal-Taixé
We hope to open a new front in multi-object tracking research that will bring us a step closer to intelligent systems that can operate safely in the real world.
2 code implementations • 1 Feb 2021 • Achal Dave, Piotr Dollár, Deva Ramanan, Alexander Kirillov, Ross Girshick
On one hand, this is desirable as it treats all classes equally.
1 code implementation • ICCV 2021 • Tarasha Khurana, Achal Dave, Deva Ramanan
We demonstrate that current detection and tracking systems perform dramatically worse on this task.
1 code implementation • NeurIPS 2020 • Rohan Taori, Achal Dave, Vaishaal Shankar, Nicholas Carlini, Benjamin Recht, Ludwig Schmidt
We study how robust current ImageNet models are to distribution shifts arising from natural variations in datasets.
Ranked #41 on Domain Generalization on VizWiz-Classification
no code implementations • ECCV 2020 • Achal Dave, Tarasha Khurana, Pavel Tokmakov, Cordelia Schmid, Deva Ramanan
To this end, we ask annotators to label objects that move at any point in the video, and give names to them post factum.
no code implementations • 25 Oct 2019 • Achal Dave, Pavel Tokmakov, Cordelia Schmid, Deva Ramanan
Moreover, at test time the same network can be applied to detection and tracking, resulting in a unified approach for the two tasks.
no code implementations • 25 Sep 2019 • Rohan Taori, Achal Dave, Vaishaal Shankar, Nicholas Carlini, Benjamin Recht, Ludwig Schmidt
We conduct a large experimental comparison of various robustness metrics for image classification.
1 code implementation • ICCV 2021 • Vaishaal Shankar, Achal Dave, Rebecca Roelofs, Deva Ramanan, Benjamin Recht, Ludwig Schmidt
Additionally, we evaluate three detection models and show that natural perturbations induce both classification as well as localization errors, leading to a median drop in detection mAP of 14 points.
no code implementations • ICML Workshop Deep_Phenomen 2019 • Vaishaal Shankar, Achal Dave, Rebecca Roelofs, Deva Ramanan, Benjamin Recht, Ludwig Schmidt
We introduce a systematic framework for quantifying the robustness of classifiers to naturally occurring perturbations of images found in videos.
1 code implementation • 11 Feb 2019 • Achal Dave, Pavel Tokmakov, Deva Ramanan
To address this concern, we propose two new benchmarks for generic, moving object detection, and show that our model matches top-down methods on common categories, while significantly outperforming both top-down and bottom-up methods on never-before-seen categories.
no code implementations • CVPR 2017 • Achal Dave, Olga Russakovsky, Deva Ramanan
While deep feature learning has revolutionized techniques for static-image understanding, the same does not quite hold for video processing.