NeurIPS 2019

XLNet: Generalized Autoregressive Pretraining for Language Understanding

NeurIPS 2019 huggingface/transformers

With the capability of modeling bidirectional contexts, denoising-autoencoding-based pretraining such as BERT achieves better performance than pretraining approaches based on autoregressive language modeling.

DOCUMENT RANKING LANGUAGE MODELLING NATURAL LANGUAGE INFERENCE QUESTION ANSWERING READING COMPREHENSION SEMANTIC TEXTUAL SIMILARITY SENTIMENT ANALYSIS TEXT CLASSIFICATION

Generating Diverse High-Fidelity Images with VQ-VAE-2

NeurIPS 2019 deepmind/sonnet

We explore the use of Vector Quantized Variational AutoEncoder (VQ-VAE) models for large scale image generation.

IMAGE GENERATION
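The core of VQ-VAE is its quantization bottleneck: each encoder output vector is snapped to its nearest codebook entry, and the straight-through trick copies gradients past the non-differentiable argmin. Below is a minimal PyTorch sketch of just that quantizer, assuming a flat (K, D) codebook; VQ-VAE-2 stacks such quantized latent maps hierarchically and learns a prior over the codes (the reference layer lives in deepmind/sonnet).

```python
import torch

def vector_quantize(z, codebook):
    """Sketch of the VQ bottleneck. z: (..., D) encoder outputs;
    codebook: (K, D) embedding table. Returns quantized latents + code ids."""
    dist = (z.unsqueeze(-2) - codebook).pow(2).sum(-1)  # (..., K) squared distances
    idx = dist.argmin(-1)                               # nearest code per vector
    z_q = codebook[idx]                                 # (..., D) quantized latents
    # Straight-through estimator: forward pass uses z_q, but the backward
    # pass sees the identity, so encoder gradients flow past the argmin.
    return z + (z_q - z).detach(), idx
```

Training additionally needs the codebook and commitment losses from the paper (or the EMA codebook update), which are omitted from this sketch.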

Levenshtein Transformer

NeurIPS 2019 pytorch/fairseq

We further confirm the flexibility of our model by showing that a Levenshtein Transformer trained for machine translation can be used straightforwardly for automatic post-editing.

AUTOMATIC POST-EDITING MACHINE TRANSLATION TEXT SUMMARIZATION

Diffusion Improves Graph Learning

NeurIPS 2019 rusty1s/pytorch_geometric

In this work, we remove the restriction of using only the direct neighbors by introducing a powerful, yet spatially localized graph convolution: Graph diffusion convolution (GDC).

NODE CLASSIFICATION
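Concretely, GDC is a preprocessing step: the adjacency matrix is replaced by a sparsified graph diffusion matrix, for instance the personalized-PageRank diffusion S = α(I − (1 − α)T)⁻¹ over a normalized transition matrix T. Here is a minimal dense NumPy sketch of that step, assuming symmetric normalization with self-loops and simple threshold sparsification; real graphs need sparse approximations, as in the GDC transform shipped with rusty1s/pytorch_geometric.

```python
import numpy as np

def gdc_ppr(adj, alpha=0.15, eps=1e-4):
    """Sketch: personalized-PageRank diffusion of a (dense) adjacency matrix."""
    n = adj.shape[0]
    a = adj + np.eye(n)                               # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a.sum(axis=1)))
    t = d_inv_sqrt @ a @ d_inv_sqrt                   # normalized transition matrix T
    s = alpha * np.linalg.inv(np.eye(n) - (1.0 - alpha) * t)  # S = a(I-(1-a)T)^-1
    s[s < eps] = 0.0                                  # sparsify small entries
    return s                                          # use S in place of adj in a GNN
```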

CondConv: Conditionally Parameterized Convolutions for Efficient Inference

NeurIPS 2019 tensorflow/tpu

We demonstrate that scaling networks with CondConv improves the performance and inference cost trade-off of several existing convolutional neural network architectures on both classification and detection tasks.

IMAGE CLASSIFICATION OBJECT DETECTION
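The idea is to replace one static kernel with a per-example mixture of expert kernels, where a small routing function computes the mixture weights from the input itself. Below is a hedged PyTorch sketch, assuming sigmoid routing over globally pooled features and the common grouped-convolution trick for applying a different kernel to each example; class and attribute names here are illustrative, not the tensorflow/tpu API.

```python
import torch
import torch.nn.functional as F

class CondConv2d(torch.nn.Module):
    """Sketch of a conditionally parameterized convolution with E experts."""
    def __init__(self, in_ch, out_ch, k=3, num_experts=4):
        super().__init__()
        self.experts = torch.nn.Parameter(
            0.05 * torch.randn(num_experts, out_ch, in_ch, k, k))
        self.route = torch.nn.Linear(in_ch, num_experts)   # routing function
        self.k = k

    def forward(self, x):                                  # x: (B, C, H, W)
        b, c, h, w = x.shape
        r = torch.sigmoid(self.route(x.mean(dim=(2, 3))))  # (B, E) example weights
        # Mix expert kernels per example: (B, out_ch, C, k, k).
        kernels = torch.einsum("be,eoihw->boihw", r, self.experts)
        out_ch = kernels.shape[1]
        # Grouped conv applies each example's own mixed kernel in one call.
        y = F.conv2d(x.reshape(1, b * c, h, w),
                     kernels.reshape(b * out_ch, c, self.k, self.k),
                     padding=self.k // 2, groups=b)
        return y.reshape(b, out_ch, y.shape[-2], y.shape[-1])
```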

Lookahead Optimizer: k steps forward, 1 step back

NeurIPS 2019 rwightman/pytorch-image-models

The vast majority of successful deep neural networks are trained using variants of stochastic gradient descent (SGD) algorithms.

IMAGE CLASSIFICATION MACHINE TRANSLATION STOCHASTIC OPTIMIZATION
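The title is the algorithm: an inner optimizer takes k "fast" steps, then the "slow" weights move a fraction α toward the fast weights and the fast weights are reset to them. A minimal sketch of that update rule as a wrapper around any torch.optim optimizer; this is an illustration, not the rwightman/pytorch-image-models implementation.

```python
import torch

class Lookahead:
    """Sketch: every k fast steps, pull slow weights toward the fast ones."""
    def __init__(self, inner, k=5, alpha=0.5):
        self.inner, self.k, self.alpha = inner, k, alpha
        self.steps = 0
        # Slow weights start as a detached copy of the fast (model) weights.
        self.slow = [p.detach().clone()
                     for g in inner.param_groups for p in g["params"]]

    def zero_grad(self):
        self.inner.zero_grad()

    def step(self):
        self.inner.step()                        # one fast-weight update
        self.steps += 1
        if self.steps % self.k == 0:             # every k steps...
            fast = [p for g in self.inner.param_groups for p in g["params"]]
            for s, p in zip(self.slow, fast):
                s.add_(p.detach() - s, alpha=self.alpha)  # slow += a*(fast-slow)
                p.data.copy_(s)                  # reset fast weights to slow
```

Usage: `opt = Lookahead(torch.optim.SGD(model.parameters(), lr=0.1))`; k = 5 and alpha = 0.5 are the defaults suggested in the paper.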

GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

NeurIPS 2019 tensorflow/lingvo

Scaling up deep neural network capacity is known to be an effective approach to improving model quality on several different machine learning tasks.

SOTA for Image Classification on CIFAR-10 (using extra training data)

FINE-GRAINED IMAGE CLASSIFICATION MACHINE TRANSLATION
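GPipe's contribution is how it scales: the network is partitioned into consecutive stages placed on different accelerators, and each mini-batch is split into micro-batches so the stages can work on different micro-batches at once. The single-process loop below only illustrates the micro-batch schedule, assuming `stages` is a list of torch.nn.Module partitions; the real system (in tensorflow/lingvo) runs stages on separate devices and recomputes activations during backprop to save memory.

```python
import torch

def gpipe_forward(stages, x, num_micro=4):
    """Conceptual sketch of GPipe-style micro-batching (single process)."""
    micro = x.chunk(num_micro)           # split the mini-batch into micro-batches
    outputs = []
    for mb in micro:                     # each micro-batch...
        for stage in stages:             # ...streams through the stage pipeline
            mb = stage(mb)
        outputs.append(mb)
    return torch.cat(outputs)            # reassemble the full mini-batch
```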

Large Memory Layers with Product Keys

NeurIPS 2019 facebookresearch/XLM

In our experiments we consider a dataset with up to 30 billion words, and we plug our memory layer in a state-of-the-art transformer-based architecture.

LANGUAGE MODELLING
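A product-key memory makes a table of n1 × n2 value slots searchable in sub-linear time by factoring every key into two half-keys: the query is split in half, each half is scored against a small sub-key table, and only the k × k Cartesian combinations of the two top-k sets are ranked. A minimal single-query PyTorch sketch under those assumptions; the actual layer in facebookresearch/XLM adds multi-head queries, batching, and sparse value updates.

```python
import torch

def product_key_lookup(q, sub_keys1, sub_keys2, values, k=4):
    """Sketch: q (d,), sub_keys1 (n1, d/2), sub_keys2 (n2, d/2),
    values (n1*n2, dv). Returns the weighted sum of the top-k value slots."""
    h = q.shape[-1] // 2
    s1 = sub_keys1 @ q[:h]                    # scores for first query half  (n1,)
    s2 = sub_keys2 @ q[h:]                    # scores for second query half (n2,)
    v1, i1 = s1.topk(k)                       # top-k candidates per half
    v2, i2 = s2.topk(k)
    scores = (v1[:, None] + v2[None, :]).reshape(-1)            # k*k candidates
    idx = (i1[:, None] * sub_keys2.shape[0] + i2[None, :]).reshape(-1)
    best, pos = scores.topk(k)                # final top-k over the candidates
    w = torch.softmax(best, dim=-1)           # attention weights over slots
    return (w[:, None] * values[idx[pos]]).sum(0)
```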

Cross-lingual Language Model Pretraining

NeurIPS 2019 facebookresearch/XLM

On unsupervised machine translation, we obtain 34.3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU.

LANGUAGE MODELLING UNSUPERVISED MACHINE TRANSLATION

Few-shot Video-to-Video Synthesis

NeurIPS 2019 NVlabs/few-shot-vid2vid

To address the limitations, we propose a few-shot vid2vid framework, which learns to synthesize videos of previously unseen subjects or scenes by leveraging a few example images of the target at test time.

VIDEO-TO-VIDEO SYNTHESIS