Search Results for author: Aravindh Mahendran

Found 15 papers, 8 papers with code

Scaling Vision Transformers to 22 Billion Parameters

1 code implementation • 10 Feb 2023 • Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin F. Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, Fantine Huot, Jasmijn Bastings, Mark Patrick Collier, Alexey Gritsenko, Vighnesh Birodkar, Cristina Vasconcelos, Yi Tay, Thomas Mensink, Alexander Kolesnikov, Filip Pavetić, Dustin Tran, Thomas Kipf, Mario Lučić, Xiaohua Zhai, Daniel Keysers, Jeremiah Harmsen, Neil Houlsby

The scaling of Transformers has driven breakthrough capabilities for language models.

Ranked #1 on Zero-Shot Transfer Image Classification on ObjectNet

Action Classification • Fairness • +3

192

Invariant Slot Attention: Object Discovery with Slot-Centric Reference Frames

1 code implementation • 9 Feb 2023 • Ondrej Biza, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Gamaleldin F. Elsayed, Aravindh Mahendran, Thomas Kipf

Automatically discovering composable abstractions from raw perceptual data is a long-standing challenge in machine learning.

Object • Object Discovery

32,981

RUST: Latent Neural Scene Representations from Unposed Imagery

no code implementations • CVPR 2023 • Mehdi S. M. Sajjadi, Aravindh Mahendran, Thomas Kipf, Etienne Pot, Daniel Duckworth, Mario Lucic, Klaus Greff

Our main insight is that one can train a Pose Encoder that peeks at the target image and learns a latent pose embedding which is used by the decoder for view synthesis.

Decoder • Novel View Synthesis

Iterative Patch Selection for High-Resolution Image Recognition

1 code implementation • 24 Oct 2022 • Benjamin Bergner, Christoph Lippert, Aravindh Mahendran

We propose a simple method, Iterative Patch Selection (IPS), which decouples the memory usage from the input size and thus enables the processing of arbitrarily large images under tight hardware constraints.

Autonomous Driving • Multiple Instance Learning • +2

SAVi++: Towards End-to-End Object-Centric Learning from Real-World Videos

1 code implementation • 15 Jun 2022 • Gamaleldin F. Elsayed, Aravindh Mahendran, Sjoerd van Steenkiste, Klaus Greff, Michael C. Mozer, Thomas Kipf

The visual world can be parsimoniously characterized in terms of distinct entities with sparse interactions.

Object • Semantic Segmentation

140

Object Scene Representation Transformer

no code implementations • 14 Jun 2022 • Mehdi S. M. Sajjadi, Daniel Duckworth, Aravindh Mahendran, Sjoerd van Steenkiste, Filip Pavetić, Mario Lučić, Leonidas J. Guibas, Klaus Greff, Thomas Kipf

A compositional understanding of the world in terms of objects and their geometry in 3D space is considered a cornerstone of human cognition.

Decoder • Novel View Synthesis • +2

Simple Open-Vocabulary Object Detection with Vision Transformers

2 code implementations • 12 May 2022 • Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf, Neil Houlsby

Combining simple architectures with large-scale pre-training has led to massive improvements in image classification.

Ranked #1 on One-Shot Object Detection on MS COCO

Described Object Detection • Image Classification • +3

3,035

Conditional Object-Centric Learning from Video

3 code implementations • ICLR 2022 • Thomas Kipf, Gamaleldin F. Elsayed, Aravindh Mahendran, Austin Stone, Sara Sabour, Georg Heigold, Rico Jonschkowski, Alexey Dosovitskiy, Klaus Greff

Object-centric representations are a promising path toward more systematic generalization by providing flexible abstractions upon which compositional world models can be built.

Instance Segmentation • Object • +3

140

Differentiable Patch Selection for Image Recognition

no code implementations • CVPR 2021 • Jean-Baptiste Cordonnier, Aravindh Mahendran, Alexey Dosovitskiy, Dirk Weissenborn, Jakob Uszkoreit, Thomas Unterthiner

Neural networks require large amounts of memory and compute to process high-resolution images, even when only a small part of the image is actually informative for the task at hand.

Traffic Sign Recognition

Representation learning from videos in-the-wild: An object-centric approach

no code implementations • 6 Oct 2020 • Rob Romijnders, Aravindh Mahendran, Michael Tschannen, Josip Djolonga, Marvin Ritter, Neil Houlsby, Mario Lucic

We propose a method to learn image representations from uncurated videos.

Few-Shot Learning • Object • +3

Object-Centric Learning with Slot Attention

8 code implementations • NeurIPS 2020 • Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf

Learning object-centric representations of complex scenes is a promising step towards enabling efficient abstract reasoning from low-level perceptual features.

Object • Object Discovery • +1

32,992

Self-Supervised Learning of Video-Induced Visual Invariances

no code implementations • CVPR 2020 • Michael Tschannen, Josip Djolonga, Marvin Ritter, Aravindh Mahendran, Xiaohua Zhai, Neil Houlsby, Sylvain Gelly, Mario Lucic

We propose a general framework for self-supervised learning of transferable visual representations based on Video-Induced Visual Invariances (VIVI).

Ranked #15 on Image Classification on VTAB-1k (using extra training data)

Image Classification • Self-Supervised Learning • +1

Cross Pixel Optical Flow Similarity for Self-Supervised Learning

no code implementations • 15 Jul 2018 • Aravindh Mahendran, James Thewlis, Andrea Vedaldi

We propose a novel method for learning convolutional neural image representations without manual supervision.

Image Classification • Image Segmentation • +4

Visualizing Deep Convolutional Neural Networks Using Natural Pre-Images

no code implementations • 7 Dec 2015 • Aravindh Mahendran, Andrea Vedaldi

Image representations, from SIFT and bag of visual words to Convolutional Neural Networks (CNNs), are a crucial component of almost all computer vision systems.

Understanding Deep Image Representations by Inverting Them

8 code implementations • CVPR 2015 • Aravindh Mahendran, Andrea Vedaldi

Image representations, from SIFT and Bag of Visual Words to Convolutional Neural Networks (CNNs), are a crucial component of almost any image understanding system.

168
