Search Results for author: Jakob Uszkoreit

Found 29 papers, 14 papers with code

Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations

1 code implementation CVPR 2022 Mehdi S. M. Sajjadi, Henning Meyer, Etienne Pot, Urs Bergmann, Klaus Greff, Noha Radwan, Suhani Vora, Mario Lucic, Daniel Duckworth, Alexey Dosovitskiy, Jakob Uszkoreit, Thomas Funkhouser, Andrea Tagliasacchi

In this work, we propose the Scene Representation Transformer (SRT), a method which processes posed or unposed RGB images of a new area, infers a "set-latent scene representation", and synthesises novel views, all in a single feed-forward pass.
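
As a rough illustration of that pipeline, here is a shape-level sketch in NumPy: pool a set of input views into a set of latent tokens, then render a queried ray by attending over that set. The sizes, random projections, and function names are placeholders, not the published SRT architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 64  # latent width (placeholder)

# Placeholder "weights"; the real model learns these with transformer blocks.
W_patch = rng.normal(size=(48, D)) / np.sqrt(48)  # 4x4 RGB patch -> token
W_query = rng.normal(size=(6, D)) / np.sqrt(6)    # ray (origin, direction) -> query
W_out = rng.normal(size=(D, 3)) / np.sqrt(D)      # attended latent -> RGB

def encode(views):
    """views: [num_views, H, W, 3] -> set-latent scene representation [T, D]."""
    n, h, w, _ = views.shape
    patches = views.reshape(n, h // 4, 4, w // 4, 4, 3).transpose(0, 1, 3, 2, 4, 5)
    patches = patches.reshape(-1, 48)             # an unordered *set* of patches
    return patches @ W_patch                      # [T, D] latent tokens

def render(ray, z):
    """ray: [6] -> RGB by cross-attending from the ray into the latent set z."""
    q = ray @ W_query
    logits = z @ q / np.sqrt(D)
    att = np.exp(logits - logits.max())
    att /= att.sum()
    return (att @ z) @ W_out                      # one feed-forward pass per ray

views = rng.uniform(size=(3, 16, 16, 3))          # three RGB input views
z = encode(views)
rgb = render(np.array([0., 0., 0., 0., 0., 1.]), z)
print(z.shape, rgb.shape)                         # (48, 64) (3,)
```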

Novel View Synthesis Semantic Segmentation

How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers

15 code implementations 18 Jun 2021 Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, Lucas Beyer

Vision Transformers (ViT) have been shown to attain highly competitive performance for a wide range of vision applications, such as image classification, object detection and semantic image segmentation.

Data Augmentation Image Classification +5

Differentiable Patch Selection for Image Recognition

no code implementations CVPR 2021 Jean-Baptiste Cordonnier, Aravindh Mahendran, Alexey Dosovitskiy, Dirk Weissenborn, Jakob Uszkoreit, Thomas Unterthiner

Neural Networks require large amounts of memory and compute to process high resolution images, even when only a small part of the image is actually informative for the task at hand.

Traffic Sign Recognition

Towards End-to-End In-Image Neural Machine Translation

no code implementations EMNLP (nlpbt) 2020 Elman Mansimov, Mitchell Stern, Mia Chen, Orhan Firat, Jakob Uszkoreit, Puneet Jain

In this paper, we offer a preliminary investigation into the task of in-image machine translation: transforming an image containing text in one language into an image containing the same text in another language.

Machine Translation Translation

Object-Centric Learning with Slot Attention

8 code implementations NeurIPS 2020 Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf

Learning object-centric representations of complex scenes is a promising step towards enabling efficient abstract reasoning from low-level perceptual features.

Object Object Discovery +1

Scaling Autoregressive Video Models

1 code implementation ICLR 2020 Dirk Weissenborn, Oscar Täckström, Jakob Uszkoreit

Due to the statistical complexity of video, the high degree of inherent stochasticity, and the sheer amount of data, generating natural video remains a challenging task.

Action Recognition Video Generation +1

KERMIT: Generative Insertion-Based Modeling for Sequences

no code implementations 4 Jun 2019 William Chan, Nikita Kitaev, Kelvin Guu, Mitchell Stern, Jakob Uszkoreit

During training, one can feed KERMIT paired data $(x, y)$ to learn the joint distribution $p(x, y)$, and optionally mix in unpaired data $x$ or $y$ to refine the marginals $p(x)$ or $p(y)$.
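
A minimal sketch of that training mixture, assuming a stub in place of KERMIT's insertion-based model: sample paired examples for the joint and occasionally unpaired examples for the marginals. `StubModel`, `mixture_step`, and the mixing rate are all hypothetical.

```python
import random

class StubModel:
    """Hypothetical stand-in; the real model scores insertions with a Transformer."""
    def loss(self, x=None, y=None):
        return float(len(x or ()) + len(y or ()))  # dummy value

def mixture_step(model, paired, xs, ys, p_unpaired=0.2):
    """One step of the paired/unpaired mixture described in the abstract."""
    r = random.random()
    if r < p_unpaired / 2:
        return model.loss(x=random.choice(xs))     # refine the marginal p(x)
    if r < p_unpaired:
        return model.loss(y=random.choice(ys))     # refine the marginal p(y)
    x, y = random.choice(paired)
    return model.loss(x=x, y=y)                    # learn the joint p(x, y)

model = StubModel()
paired = [("ein Test", "a test")]
print(mixture_step(model, paired, xs=["nur Deutsch"], ys=["English only"]))
```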

Machine Translation Question Answering +2

Insertion Transformer: Flexible Sequence Generation via Insertion Operations

no code implementations 8 Feb 2019 Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit

We present the Insertion Transformer, an iterative, partially autoregressive model for sequence generation based on insertion operations.
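
To make the insertion mechanics concrete, here is a toy decoding loop in which a scripted oracle stands in for the trained model and always inserts the middle token of each missing span, the balanced binary-tree order that lets decoding finish in roughly log2(n) parallel steps. The oracle assumes unique tokens and is purely illustrative.

```python
def balanced_oracle(target, hypothesis):
    """For each gap in the hypothesis, propose (slot, token) for the middle
    of the missing span; a stand-in for the model's insertion predictions."""
    idx = [target.index(t) for t in hypothesis]    # assumes unique tokens
    bounds = [-1] + idx + [len(target)]
    insertions = []
    for slot, (lo, hi) in enumerate(zip(bounds, bounds[1:])):
        if hi - lo > 1:                            # tokens missing in this gap
            insertions.append((slot, target[(lo + hi) // 2]))
    return insertions

def decode(target):
    """Grow the sequence by parallel (position, token) insertions."""
    hyp = []
    while len(hyp) < len(target):
        # Apply insertions right to left so earlier slots stay valid.
        for slot, tok in reversed(balanced_oracle(target, hyp)):
            hyp.insert(slot, tok)
        print(hyp)
    return hyp

decode(list("abcdefg"))  # finishes in 3 rounds: d / bdf / abcdefg
```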

Machine Translation Translation +1

Blockwise Parallel Decoding for Deep Autoregressive Models

no code implementations NeurIPS 2018 Mitchell Stern, Noam Shazeer, Jakob Uszkoreit

Deep autoregressive sequence-to-sequence models have demonstrated impressive performance across a wide variety of tasks in recent years.

Image Super-Resolution Machine Translation +1

Music Transformer

12 code implementations ICLR 2019 Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Noam Shazeer, Ian Simon, Curtis Hawthorne, Andrew M. Dai, Matthew D. Hoffman, Monica Dinculescu, Douglas Eck

This is impractical for long sequences such as musical compositions since their memory complexity for intermediate relative information is quadratic in the sequence length.
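
The quadratic term comes from materializing a distinct relative-position embedding for every query-key pair; the paper's "skewing" trick recovers the same logits from a single [L, d] embedding table. A NumPy sketch of the causal case, with illustrative names and sizes, comparing the two routes:

```python
import numpy as np

rng = np.random.default_rng(0)
L, d = 5, 8
Q = rng.normal(size=(L, d))
Er = rng.normal(size=(L, d))       # Er[r]: embedding for relative distance r - (L - 1)

# Naive route, quadratic memory: build R[i, j] = Er[j - i + L - 1] explicitly.
R = np.zeros((L, L, d))
for i in range(L):
    for j in range(i + 1):
        R[i, j] = Er[j - i + L - 1]
S_naive = np.einsum("id,ijd->ij", Q, R)

# Skewing, linear extra memory: one [L, L] matmul, then pad, reshape, slice.
A = Q @ Er.T                                   # [L, L]
padded = np.pad(A, ((0, 0), (1, 0)))           # prepend a zero column
S_rel = padded.reshape(L + 1, L)[1:]           # realign rows, drop the first

mask = np.tril(np.ones((L, L), dtype=bool))    # causal: only j <= i matters
assert np.allclose(S_naive[mask], S_rel[mask])
```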

Music Generation Music Modeling

Universal Transformers

8 code implementations ICLR 2019 Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob Uszkoreit, Łukasz Kaiser

Feed-forward and convolutional architectures have recently been shown to achieve superior results on some sequence modeling tasks such as machine translation, with the added advantage that they concurrently process all inputs in the sequence, leading to easy parallelization and faster training times.

Inductive Bias LAMBADA +4

Tensor2Tensor for Neural Machine Translation

14 code implementations WS 2018 Ashish Vaswani, Samy Bengio, Eugene Brevdo, Francois Chollet, Aidan N. Gomez, Stephan Gouws, Llion Jones, Łukasz Kaiser, Nal Kalchbrenner, Niki Parmar, Ryan Sepassi, Noam Shazeer, Jakob Uszkoreit

Tensor2Tensor is a library for deep learning models that is well-suited for neural machine translation and includes the reference implementation of the state-of-the-art Transformer model.

Machine Translation Translation

Fast Decoding in Sequence Models using Discrete Latent Variables

no code implementations ICML 2018 Łukasz Kaiser, Aurko Roy, Ashish Vaswani, Niki Parmar, Samy Bengio, Jakob Uszkoreit, Noam Shazeer

Finally, we evaluate our model end-to-end on the task of neural machine translation, where it is an order of magnitude faster at decoding than comparable autoregressive models.

Machine Translation Translation

Self-Attention with Relative Position Representations

12 code implementations NAACL 2018 Peter Shaw, Jakob Uszkoreit, Ashish Vaswani

On the WMT 2014 English-to-German and English-to-French translation tasks, this approach yields improvements of 1.3 BLEU and 0.3 BLEU over absolute position representations, respectively.
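
A sketch of the mechanism, with relative-distance embeddings (clipped to a maximum distance k) added to the keys and values inside a single attention head; the shapes and random inputs below are placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
L, d, k = 6, 8, 2                     # sequence length, head width, clip distance
X = rng.normal(size=(L, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
aK = rng.normal(size=(2 * k + 1, d))  # one key embedding per clipped distance
aV = rng.normal(size=(2 * k + 1, d))  # likewise for values

Q, K, V = X @ Wq, X @ Wk, X @ Wv
dist = np.clip(np.arange(L)[None, :] - np.arange(L)[:, None], -k, k) + k  # [L, L]

# e_ij = Q_i . (K_j + a^K_ij) / sqrt(d), then z_i = sum_j att_ij (V_j + a^V_ij)
logits = (Q @ K.T + np.einsum("id,ijd->ij", Q, aK[dist])) / np.sqrt(d)
att = np.exp(logits - logits.max(axis=-1, keepdims=True))
att /= att.sum(axis=-1, keepdims=True)
Z = att @ V + np.einsum("ij,ijd->id", att, aV[dist])
print(Z.shape)  # (6, 8)
```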

Machine Translation Position +1

Image Transformer

no code implementations 15 Feb 2018 Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Łukasz Kaiser, Noam Shazeer, Alexander Ku, Dustin Tran

Image generation has been successfully cast as an autoregressive sequence generation or transformation problem.
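
That framing made concrete: flatten the pixels in raster order and score them with the chain rule, log p(x) = sum_t log p(x_t | x_<t). The count-based conditional below is a toy stand-in for the Transformer decoder's softmax:

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, levels = 4, 4, 8
img = rng.integers(0, levels, size=(H, W))

seq = img.reshape(-1)                 # raster-scan (row-major) pixel sequence

def toy_conditional(prefix):
    """Toy p(x_t | x_<t): smoothed histogram of the tokens seen so far."""
    counts = np.bincount(prefix, minlength=levels) + 1.0   # add-one smoothing
    return counts / counts.sum()

log_p = sum(np.log(toy_conditional(seq[:t])[seq[t]]) for t in range(len(seq)))
print(f"log p(x) = {log_p:.2f} over {len(seq)} pixel tokens")
```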

Density Estimation Image Generation +1

Coarse-to-Fine Question Answering for Long Documents

no code implementations ACL 2017 Eunsol Choi, Daniel Hewlett, Jakob Uszkoreit, Illia Polosukhin, Alexandre Lacoste, Jonathan Berant

We present a framework for question answering that can efficiently scale to longer documents while maintaining or even improving performance of state-of-the-art models.

Question Answering Reading Comprehension +1

One Model To Learn Them All

1 code implementation 16 Jun 2017 Lukasz Kaiser, Aidan N. Gomez, Noam Shazeer, Ashish Vaswani, Niki Parmar, Llion Jones, Jakob Uszkoreit

We present a single model that yields good results on a number of problems spanning multiple domains.

Image Captioning Image Classification +3

Attention Is All You Need

568 code implementations NeurIPS 2017 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration.
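
The Transformer's core primitive is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A minimal NumPy rendering (shapes here are arbitrary):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V, mask=None):
    """softmax(Q K^T / sqrt(d_k)) V, with an optional (e.g. causal) mask."""
    d_k = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d_k)
    if mask is not None:
        logits = np.where(mask, logits, -1e9)   # block disallowed positions
    att = np.exp(logits - logits.max(axis=-1, keepdims=True))
    att /= att.sum(axis=-1, keepdims=True)
    return att @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(5, 16))                    # 5 queries
K = rng.normal(size=(7, 16))                    # 7 keys
V = rng.normal(size=(7, 16))                    # 7 values
print(scaled_dot_product_attention(Q, K, V).shape)   # (5, 16)
```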

Ranked #2 on Multimodal Machine Translation on Multi30K (BLEU (DE-EN) metric)

Abstractive Text Summarization Coreference Resolution +8

Hierarchical Question Answering for Long Documents

no code implementations6 Nov 2016 Eunsol Choi, Daniel Hewlett, Alexandre Lacoste, Illia Polosukhin, Jakob Uszkoreit, Jonathan Berant

We present a framework for question answering that can efficiently scale to longer documents while maintaining or even improving performance of state-of-the-art models.

Question Answering Reading Comprehension +1
