A Distributional Perspective on Reinforcement Learning

ICML 2017 10 code implementations

We obtain both state-of-the-art results and anecdotal evidence demonstrating the importance of the value distribution in approximate reinforcement learning.

ATARI GAMES

Random Erasing Data Augmentation

16 Aug 2017 6 code implementations

In this paper, we introduce Random Erasing, a new data augmentation method for training convolutional neural networks (CNNs).

IMAGE AUGMENTATION IMAGE CLASSIFICATION OBJECT DETECTION PERSON RE-IDENTIFICATION

Fair Regression: Quantitative Definitions and Reduction-based Algorithms

30 May 2019 1 code implementation

Our schemes only require access to standard risk minimization algorithms (such as standard classification or least-squares regression) while providing theoretical guarantees on the optimality and fairness of the obtained solutions.

REGRESSION

Market Making via Reinforcement Learning

11 Apr 2018 1 code implementation

Market making is a fundamental trading problem in which an agent provides liquidity by continually offering to buy and sell a security.

Probabilistic Face Embeddings

ICCV 2019 1 code implementation

Embedding methods have achieved success in face recognition by comparing facial features in a latent semantic space.

FACE RECOGNITION

Coupling Adaptive Batch Sizes with Learning Rates

15 Dec 2016 1 code implementation

The batch size significantly influences the behavior of the stochastic optimization algorithm, since it determines the variance of the gradient estimates.

IMAGE CLASSIFICATION STOCHASTIC OPTIMIZATION
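
The link between batch size and gradient variance noted in the entry above can be illustrated with a small simulation. This is a minimal sketch and not the paper's method: it assumes a toy one-dimensional least-squares problem, samples mini-batches with replacement, and uses hypothetical helper names (`per_example_gradients`, `minibatch_gradient_variance`). Under those assumptions, the variance of the mini-batch mean gradient should be close to the per-example gradient variance divided by the batch size.

```python
import random
import statistics


def per_example_gradients(w, xs, ys):
    """Gradient of 0.5 * (w*x - y)**2 with respect to w, one value per example."""
    return [(w * x - y) * x for x, y in zip(xs, ys)]


def minibatch_gradient_variance(grads, batch_size, n_batches, rng):
    """Empirical variance of the mini-batch mean gradient.

    Mini-batches are drawn with replacement (an assumption of this sketch).
    """
    estimates = [
        statistics.fmean(rng.choices(grads, k=batch_size))
        for _ in range(n_batches)
    ]
    return statistics.pvariance(estimates)


# Toy data (illustrative only): y = 3x + noise.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(2000)]
ys = [3.0 * x + rng.gauss(0.0, 0.5) for x in xs]

# Per-example gradients at w = 0, and their variance over the dataset.
grads = per_example_gradients(w=0.0, xs=xs, ys=ys)
sigma2 = statistics.pvariance(grads)

for m in (1, 4, 16, 64):
    observed = minibatch_gradient_variance(grads, m, n_batches=4000, rng=rng)
    print(f"batch_size={m:3d}  observed={observed:.4f}  sigma2/m={sigma2 / m:.4f}")
```

Each printed row compares the simulated variance with the `sigma2 / m` prediction; quadrupling the batch size should cut the variance by roughly a factor of four.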

The Advantage of Doubling: A Deep Reinforcement Learning Approach to Studying the Double Team in the NBA

8 Mar 2018 1 code implementation

During the 2017 NBA playoffs, Celtics coach Brad Stevens was faced with a difficult decision when defending against the Cavaliers: "Do you double and risk giving up easy shots, or stay at home and do the best you can?"

RAIL: Risk-Averse Imitation Learning

20 Jul 2017 1 code implementation

Generative Adversarial Imitation Learning (GAIL) is a state-of-the-art algorithm for learning policies when the expert's behavior is available as a fixed set of trajectories.

AUTONOMOUS DRIVING CONTINUOUS CONTROL IMITATION LEARNING

Separating value functions across time-scales

5 Feb 2019 1 code implementation

In settings where this bias is unacceptable (where the system must optimize for longer horizons at higher discounts), the target of the value function approximator may increase in variance, leading to difficulties in learning.

Deep Neural Networks with Multi-Branch Architectures Are Less Non-Convex

6 Jun 2018 1 code implementation

We show that one cause of such success is that the multi-branch architecture is less non-convex in terms of duality gap.