Search Results for author: Yann Ollivier

Found 31 papers, 9 papers with code

Simple Ingredients for Offline Reinforcement Learning

no code implementations • 19 Mar 2024 • Edoardo Cetin, Andrea Tirinzoni, Matteo Pirotta, Alessandro Lazaric, Yann Ollivier, Ahmed Touati

Offline reinforcement learning algorithms have proven effective on datasets highly connected to the target downstream task.

D4RL reinforcement-learning

Does Zero-Shot Reinforcement Learning Exist?

1 code implementation • 29 Sep 2022 • Ahmed Touati, Jérémy Rapin, Yann Ollivier

A zero-shot RL agent is an agent that can solve any RL task in a given environment, instantly with no additional planning or learning, after an initial reward-free learning phase.

Contrastive Learning reinforcement-learning +2

Agnostic Physics-Driven Deep Learning

no code implementations • 30 May 2022 • Benjamin Scellier, Siddhartha Mishra, Yoshua Bengio, Yann Ollivier

This work establishes that a physical system can perform statistical learning without gradient computations, via an Agnostic Equilibrium Propagation (Aeqprop) procedure that combines energy minimization, homeostatic control, and nudging towards the correct response.

Unbiased Methods for Multi-Goal Reinforcement Learning

no code implementations • 16 Jun 2021 • Léonard Blier, Yann Ollivier

We introduce unbiased deep Q-learning and actor-critic algorithms that can handle such infinitely sparse rewards, and test them in toy environments.

Multi-Goal Reinforcement Learning Q-Learning +2

Learning One Representation to Optimize All Rewards

2 code implementations • NeurIPS 2021 • Ahmed Touati, Yann Ollivier

In the test phase, a reward representation is estimated either from observations or an explicit reward description (e.g., a target state).

Learning Successor States and Goal-Dependent Values: A Mathematical Viewpoint

no code implementations • 18 Jan 2021 • Léonard Blier, Corentin Tallec, Yann Ollivier

In reinforcement learning, temporal difference-based algorithms can be sample-inefficient: for instance, with sparse rewards, no learning occurs until a reward is observed.

Convergence of Online Adaptive and Recurrent Optimization Algorithms

no code implementations • 12 May 2020 • Pierre-Yves Massé, Yann Ollivier

This is more data-agnostic and creates differences with respect to standard SGD theory, especially for the range of possible learning rates.

An Equivalence between Bayesian Priors and Penalties in Variational Inference

no code implementations • 1 Feb 2020 • Pierre Wolinski, Guillaume Charpiat, Yann Ollivier

We fully characterize the regularizers that can arise according to this procedure, and provide a systematic way to compute the prior corresponding to a given penalty.

Variational Inference

White-box vs Black-box: Bayes Optimal Strategies for Membership Inference

no code implementations • 29 Aug 2019 • Alexandre Sablayrolles, Matthijs Douze, Yann Ollivier, Cordelia Schmid, Hervé Jégou

Membership inference determines, given a sample and trained parameters of a machine learning model, whether the sample was part of the training set.

Adversarial Vulnerability of Neural Networks Increases with Input Dimension

no code implementations • ICLR 2019 • Carl-Johann Simon-Gabriel, Yann Ollivier, Léon Bottou, Bernhard Schölkopf, David Lopez-Paz

Over the past four years, neural networks have been proven vulnerable to adversarial images: targeted but imperceptible image perturbations lead to drastically different predictions.

Separating value functions across time-scales

1 code implementation • 5 Feb 2019 • Joshua Romoff, Peter Henderson, Ahmed Touati, Emma Brunskill, Joelle Pineau, Yann Ollivier

In settings where this bias is unacceptable, where the system must optimize for longer horizons at higher discounts, the target of the value function approximator may increase in variance, leading to difficulties in learning.

Reinforcement Learning (RL)

Making Deep Q-learning methods robust to time discretization

1 code implementation • 28 Jan 2019 • Corentin Tallec, Léonard Blier, Yann Ollivier

Despite remarkable successes, Deep Reinforcement Learning (DRL) is not robust to hyperparameterization, implementation details, or small environment changes (Henderson et al. 2017, Zhang et al. 2018).

Q-Learning

The Extended Kalman Filter is a Natural Gradient Descent in Trajectory Space

no code implementations • 3 Jan 2019 • Yann Ollivier

In principle this makes it possible to treat the underlying trajectory as the parameter of a statistical model of the observations.

Learning with Random Learning Rates

1 code implementation • 2 Oct 2018 • Léonard Blier, Pierre Wolinski, Yann Ollivier

Hyperparameter tuning is a bothersome step in the training of deep learning models.

Learning with Random Learning Rates.

no code implementations • 27 Sep 2018 • Léonard Blier, Pierre Wolinski, Yann Ollivier

Hyperparameter tuning is a bothersome step in the training of deep learning models.

Mixed batches and symmetric discriminators for GAN training

no code implementations • ICML 2018 • Thomas Lucas, Corentin Tallec, Jakob Verbeek, Yann Ollivier

We propose to feed the discriminator with mixed batches of true and fake samples, and train it to predict the ratio of true samples in the batch.

Approximate Temporal Difference Learning is a Gradient Descent for Reversible Policies

no code implementations • 2 May 2018 • Yann Ollivier

In this case, approximate TD is exactly a gradient descent of the Dirichlet norm, the norm of the difference of gradients between the true and approximate value functions.

Can recurrent neural networks warp time?

1 code implementation • ICLR 2018 • Corentin Tallec, Yann Ollivier

Successful recurrent models such as long short-term memories (LSTMs) and gated recurrent units (GRUs) use ad hoc gating mechanisms.

The Description Length of Deep Learning Models

no code implementations • NeurIPS 2018 • Léonard Blier, Yann Ollivier

This might explain the relatively poor practical performance of variational methods in deep learning.

First-order Adversarial Vulnerability of Neural Networks and Input Dimension

1 code implementation • ICLR 2019 • Carl-Johann Simon-Gabriel, Yann Ollivier, Léon Bottou, Bernhard Schölkopf, David Lopez-Paz

Over the past few years, neural networks were proven vulnerable to adversarial images: targeted but imperceptible image perturbations lead to drastically different predictions.

True Asymptotic Natural Gradient Optimization

no code implementations • 22 Dec 2017 • Yann Ollivier

We introduce a simple algorithm, True Asymptotic Natural Gradient Optimization (TANGO), that converges to a true natural gradient descent in the limit of small learning rates, without explicit Fisher matrix estimation.

Natural Langevin Dynamics for Neural Networks

1 code implementation • 4 Dec 2017 • Gaétan Marceau-Caron, Yann Ollivier

The resulting natural Langevin dynamics combines the advantages of Amari's natural gradient descent and Fisher-preconditioned Langevin dynamics for large neural networks.

Unbiasing Truncated Backpropagation Through Time

no code implementations • ICLR 2018 • Corentin Tallec, Yann Ollivier

Truncated BPTT keeps the computational benefits of Backpropagation Through Time (BPTT) while relieving the need for a complete backtrack through the whole data sequence at every step.

Language Modelling

Online Natural Gradient as a Kalman Filter

no code implementations • 1 Mar 2017 • Yann Ollivier

In the recurrent case, we prove that the joint Kalman filter over states and parameters is a natural gradient on top of real-time recurrent learning (RTRL), a classical algorithm to train recurrent models.

Unbiased Online Recurrent Optimization

1 code implementation • ICLR 2018 • Corentin Tallec, Yann Ollivier

The novel Unbiased Online Recurrent Optimization (UORO) algorithm allows for online learning of general recurrent computational graphs such as recurrent network models.

Practical Riemannian Neural Networks

no code implementations • 25 Feb 2016 • Gaétan Marceau-Caron, Yann Ollivier

We provide the first experimental results on non-synthetic datasets for the quasi-diagonal Riemannian gradient descents for neural networks introduced in [Ollivier, 2015].

Speed learning on the fly

no code implementations • 8 Nov 2015 • Pierre-Yves Massé, Yann Ollivier

The practical performance of online stochastic gradient descent algorithms is highly dependent on the chosen step size, which must be tediously hand-tuned in many applications.

Training recurrent networks online without backtracking

no code implementations • 28 Jul 2015 • Yann Ollivier, Corentin Tallec, Guillaume Charpiat

The evolution of this search direction is partly stochastic and is constructed in such a way as to provide, at every time, an unbiased random estimate of the gradient of the loss function with respect to the parameters.

Auto-encoders: reconstruction versus compression

no code implementations • 30 Mar 2014 • Yann Ollivier

We discuss the similarities and differences between training an auto-encoder to minimize the reconstruction error, and training the same auto-encoder to compress the data via a generative model.

Denoising

Riemannian metrics for neural networks II: recurrent networks and learning symbolic data sequences

no code implementations • 3 Jun 2013 • Yann Ollivier

Recurrent neural networks are powerful models for sequential data, able to represent complex dependencies in the sequence that simpler models such as hidden Markov models cannot handle.

Riemannian metrics for neural networks I: feedforward networks

no code implementations • 4 Mar 2013 • Yann Ollivier

We describe four algorithms for neural network training, each adapted to different scalability constraints.
