
Learned Optimizers that Scale and Generalize

ICML 2017 tensorflow/models

Two of the primary barriers to the adoption of learned optimizers are an inability to scale to larger problems and a limited ability to generalize to new tasks.

Efficient softmax approximation for GPUs

ICML 2017 huggingface/transformers

We propose an approximate strategy to efficiently train neural-network-based language models over very large vocabularies.
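As a rough illustration (not the paper's implementation), the core idea behind an adaptive softmax can be sketched as a two-level softmax: frequent words live in a small "head" softmax alongside a special tail token, and rare words pay for a second, smaller softmax only when needed. The function names and toy vocabulary below are hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def adaptive_softmax_prob(head_logits, tail_logits, word):
    """Two-level softmax sketch: `head_logits` maps frequent words
    plus a special 'TAIL' token to scores; `tail_logits` maps rare
    words to scores. A rare word's probability is the head's TAIL
    probability times its probability within the tail cluster."""
    head_words = list(head_logits)
    head_probs = dict(zip(head_words,
                          softmax([head_logits[w] for w in head_words])))
    if word in head_probs and word != "TAIL":
        return head_probs[word]
    tail_words = list(tail_logits)
    tail_probs = dict(zip(tail_words,
                          softmax([tail_logits[w] for w in tail_words])))
    return head_probs["TAIL"] * tail_probs[word]
```

Because most tokens in natural text are frequent, most training steps touch only the small head softmax instead of the full vocabulary.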

Learning Important Features Through Propagating Activation Differences

ICML 2017 slundberg/shap

Here we present DeepLIFT (Deep Learning Important FeaTures), a method for decomposing the output prediction of a neural network on a specific input by backpropagating the contributions of all neurons in the network to every feature of the input.
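For a single linear unit, DeepLIFT's contribution rule reduces to a weight times the input's difference from a reference, and the contributions sum to the change in the unit's output (the "summation-to-delta" property). A minimal sketch of that linear rule, with hypothetical names:

```python
def deeplift_linear_contributions(weights, x, x_ref):
    """Linear-unit sketch of DeepLIFT: each input's contribution is
    its weight times its difference from the reference input. The
    contributions satisfy summation-to-delta: they add up to the
    output change relative to the reference activation."""
    return [w * (xi - ri) for w, xi, ri in zip(weights, x, x_ref)]
```

Propagating such contributions backwards layer by layer, rather than raw gradients, is what lets the method assign importance even through saturated units.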


Language Modeling with Gated Convolutional Networks

ICML 2017 facebookresearch/fairseq-py

The predominant approach to language modeling to date is based on recurrent neural networks.
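The building block this paper uses instead of recurrence is the gated linear unit (GLU): the elementwise product of one linear projection with the sigmoid of another, h = A ⊗ σ(B). A toy sketch on plain vectors (the two convolution outputs are stood in for by the lists `a` and `b`):

```python
import math

def glu(a, b):
    """Gated linear unit sketch: multiply one projection elementwise
    by the sigmoid gate of another, h = a * sigmoid(b). The gate
    controls how much of each channel passes through."""
    return [ai * (1.0 / (1.0 + math.exp(-bi))) for ai, bi in zip(a, b)]
```

With the gate wide open (large `b`) the unit passes `a` through unchanged; with the gate at zero it halves the signal, giving the network a learned, differentiable switch per channel.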


Convolutional Sequence to Sequence Learning

ICML 2017 facebookresearch/fairseq-py

The prevalent approach to sequence-to-sequence learning maps an input sequence to a variable-length output sequence via recurrent neural networks.


Learning to Discover Cross-Domain Relations with Generative Adversarial Networks

ICML 2017 eriklindernoren/Keras-GAN

While humans easily recognize relations between data from different domains without any supervision, learning to discover them automatically is very challenging in general and requires many ground-truth pairs that illustrate the relations.
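The unpaired training signal in DiscoGAN-style models comes from a reconstruction constraint: mapping a sample from domain A to B and back to A should return the original. A scalar toy sketch of that round-trip loss (function names are hypothetical):

```python
def reconstruction_loss(g_ab, g_ba, xs):
    """Round-trip consistency sketch: g_ab maps domain A to B and
    g_ba maps B back to A; the mean squared error of the round trip
    is the reconstruction loss (here on scalar samples)."""
    return sum((g_ba(g_ab(x)) - x) ** 2 for x in xs) / len(xs)
```

When the two generators are consistent inverses of each other, the loss vanishes, which is exactly the condition that lets the relation be learned without paired examples.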

Conditional Image Synthesis With Auxiliary Classifier GANs

ICML 2017 eriklindernoren/PyTorch-GAN

We expand on previous work for image quality assessment to provide two new analyses for assessing the discriminability and diversity of samples from class-conditional image synthesis models.
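The models being assessed here are AC-GANs, whose discriminator is trained on two log-likelihoods: one for the real/fake source (L_S) and one for the class label (L_C). A minimal sketch of that combined objective for a real sample, with hypothetical probabilities as inputs:

```python
import math

def acgan_discriminator_loss(p_real, p_class_correct):
    """AC-GAN sketch: the discriminator's loss on a real sample
    combines the negative log-likelihood of calling it real (L_S)
    with the negative log-likelihood of its true class (L_C)."""
    l_s = -math.log(p_real)          # source (real vs. fake) term
    l_c = -math.log(p_class_correct) # auxiliary class term
    return l_s + l_c
```

The auxiliary class term is what conditions synthesis on labels and makes per-class discriminability a natural thing to measure.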


Adaptive Neural Networks for Efficient Inference

ICML 2017 NervanaSystems/distiller

We first pose an adaptive network evaluation scheme, in which we learn a system that adaptively chooses which components of a deep network to evaluate for each example.
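In its simplest form, the adaptive-evaluation idea is an early-exit policy: run a cheap model first and accept its answer when it is confident, falling back to an expensive model otherwise. A sketch under that assumption (model interfaces and threshold are hypothetical):

```python
def adaptive_predict(cheap_model, expensive_model, x, threshold=0.9):
    """Early-exit sketch: each model returns (label, confidence).
    Accept the cheap model's prediction when its confidence clears
    the threshold; otherwise pay for the expensive model."""
    label, conf = cheap_model(x)
    if conf >= threshold:
        return label, "cheap"
    return expensive_model(x)[0], "expensive"
```

Average inference cost then depends on how many examples are "easy" enough to exit early, which is the quantity such systems learn to exploit.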

Asynchronous Stochastic Gradient Descent with Delay Compensation

ICML 2017 microsoft/dmtk

We propose a novel technique to compensate for the stale (delayed) gradients in ASGD, so as to make its optimization behavior closer to that of sequential SGD.
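The compensation in DC-ASGD corrects a stale gradient with a cheap diagonal curvature term: the gradient computed at the old weights is adjusted by λ · g ⊙ g ⊙ (w_now − w_backup). A per-coordinate sketch (names and the λ value are illustrative):

```python
def delay_compensated_gradient(g, w_now, w_backup, lam=0.04):
    """DC-ASGD sketch: approximate the gradient at the current
    weights w_now from a stale gradient g computed at w_backup,
    adding a diagonal (outer-product-approximation) curvature
    correction lam * g * g * (w_now - w_backup) per coordinate."""
    return [gi + lam * gi * gi * (wn - wb)
            for gi, wn, wb in zip(g, w_now, w_backup)]
```

When the weights have not moved since the gradient was computed, the correction vanishes and the stale gradient is used as-is.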

A Distributional Perspective on Reinforcement Learning

ICML 2017 facebookresearch/ReAgent

We obtain both state-of-the-art results and anecdotal evidence demonstrating the importance of the value distribution in approximate reinforcement learning.
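The value distribution is maintained over a fixed set of return atoms, and the distributional Bellman update shifts those atoms by r + γz and projects the result back onto the fixed support. A minimal categorical-projection sketch (a simplification of the paper's C51-style update; names are hypothetical):

```python
import math

def project_distribution(probs, atoms, reward, gamma):
    """Categorical projection sketch: shift each return atom z to
    r + gamma*z, clip to the support, and split its probability
    mass linearly between the two nearest fixed atoms."""
    v_min, v_max = atoms[0], atoms[-1]
    dz = atoms[1] - atoms[0]  # assumes evenly spaced atoms
    out = [0.0] * len(atoms)
    for p, z in zip(probs, atoms):
        tz = min(max(reward + gamma * z, v_min), v_max)
        b = (tz - v_min) / dz           # fractional atom index
        lo, hi = math.floor(b), math.ceil(b)
        if lo == hi:
            out[lo] += p
        else:
            out[lo] += p * (hi - b)
            out[hi] += p * (b - lo)
    return out
```

Because mass is only redistributed between neighboring atoms, the projected vector remains a valid probability distribution over returns.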