Search Results for author: André Susano Pinto

Found 10 papers, 5 papers with code

LocCa: Visual Pretraining with Location-aware Captioners

no code implementations • 28 Mar 2024 • Bo Wan, Michael Tschannen, Yongqin Xian, Filip Pavetic, Ibrahim Alabdulmohsin, Xiao Wang, André Susano Pinto, Andreas Steiner, Lucas Beyer, Xiaohua Zhai

In this paper, we propose a simple visual pretraining method with location-aware captioners (LocCa).

Image Captioning

A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision

1 code implementation • 30 Mar 2023 • Lucas Beyer, Bo Wan, Gagan Madan, Filip Pavetic, Andreas Steiner, Alexander Kolesnikov, André Susano Pinto, Emanuele Bugliarello, Xiao Wang, Qihang Yu, Liang-Chieh Chen, Xiaohua Zhai

A key finding is that a small decoder learned on top of a frozen pretrained encoder works surprisingly well.

Multi-Task Learning, Optical Character Recognition, +3

1,567

Tuning computer vision models with task rewards

1 code implementation • 16 Feb 2023 • André Susano Pinto, Alexander Kolesnikov, Yuge Shi, Lucas Beyer, Xiaohua Zhai

Misalignment between model predictions and intended usage can be detrimental for the deployment of computer vision models.

Colorization, Image Captioning, +5

1,567

UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes

1 code implementation • 20 May 2022 • Alexander Kolesnikov, André Susano Pinto, Lucas Beyer, Xiaohua Zhai, Jeremiah Harmsen, Neil Houlsby

We introduce UViM, a unified approach capable of modeling a wide range of computer vision tasks.

Colorization, Depth Estimation, +4

1,567

Learning to Merge Tokens in Vision Transformers

1 code implementation • 24 Feb 2022 • Cedric Renggli, André Susano Pinto, Neil Houlsby, Basil Mustafa, Joan Puigcerver, Carlos Riquelme

Transformers are widely applied to solve natural language understanding and computer vision tasks.

Natural Language Understanding

Scaling Vision with Sparse Mixture of Experts

1 code implementation • NeurIPS 2021 • Carlos Riquelme, Joan Puigcerver, Basil Mustafa, Maxim Neumann, Rodolphe Jenatton, André Susano Pinto, Daniel Keysers, Neil Houlsby

We present a Vision MoE (V-MoE), a sparse version of the Vision Transformer, that is scalable and competitive with the largest dense networks.

Ranked #1 on Few-Shot Image Classification on ImageNet - 5-shot

Few-Shot Image Classification

513

Deep Ensembles for Low-Data Transfer Learning

no code implementations • 14 Oct 2020 • Basil Mustafa, Carlos Riquelme, Joan Puigcerver, André Susano Pinto, Daniel Keysers, Neil Houlsby

In the low-data regime, it is difficult to train good supervised models from scratch.

Ranked #6 on Image Classification on VTAB-1k (using extra training data)

Image Classification, Transfer Learning

Which Model to Transfer? Finding the Needle in the Growing Haystack

no code implementations • CVPR 2022 • Cedric Renggli, André Susano Pinto, Luka Rimanic, Joan Puigcerver, Carlos Riquelme, Ce Zhang, Mario Lucic

Transfer learning has been recently popularized as a data-efficient alternative to training models from scratch, in particular for computer vision tasks where it provides a remarkably solid baseline.

Transfer Learning

Training general representations for remote sensing using in-domain knowledge

no code implementations • 30 Sep 2020 • Maxim Neumann, André Susano Pinto, Xiaohua Zhai, Neil Houlsby

Automatically finding good and general remote sensing representations allows to perform transfer learning on a wide range of applications - improving the accuracy and reducing the required number of training samples.

Representation Learning, Transfer Learning

Scalable Transfer Learning with Expert Models

no code implementations • ICLR 2021 • Joan Puigcerver, Carlos Riquelme, Basil Mustafa, Cedric Renggli, André Susano Pinto, Sylvain Gelly, Daniel Keysers, Neil Houlsby

We explore the use of expert representations for transfer with a simple, yet effective, strategy.

Ranked #11 on Image Classification on VTAB-1k (using extra training data)

Image Classification, Transfer Learning
