Search Results for author: Pierre H. Richemond

Found 17 papers, 5 papers with code

The Edge of Orthogonality: A Simple View of What Makes BYOL Tick

no code implementations9 Feb 2023 Pierre H. Richemond, Allison Tam, Yunhao Tang, Florian Strub, Bilal Piot, Felix Hill

With simple linear algebra, we show that when using a linear predictor, the optimal predictor is close to an orthogonal projection, and we propose a general framework based on orthonormalization that helps interpret and build intuition for why BYOL works.
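
As a rough illustration of the orthonormalization idea (a generic sketch under our own assumptions, not the paper's implementation), a linear predictor W can be replaced by its nearest orthogonal matrix, obtained from the SVD polar factor:

```python
# Illustrative sketch (not the paper's code): one common way to
# "orthonormalize" a linear predictor W is to replace it by the nearest
# orthogonal matrix, i.e. the polar factor U @ Vt from the SVD.
import numpy as np

def orthonormalize(W: np.ndarray) -> np.ndarray:
    U, _, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ Vt  # closest orthogonal matrix in Frobenius norm

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))          # a hypothetical linear predictor
W_orth = orthonormalize(W)
print(np.allclose(W_orth @ W_orth.T, np.eye(8), atol=1e-6))  # True
```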

SemPPL: Predicting pseudo-labels for better contrastive representations

2 code implementations12 Jan 2023 Matko Bošnjak, Pierre H. Richemond, Nenad Tomasev, Florian Strub, Jacob C. Walker, Felix Hill, Lars Holger Buesing, Razvan Pascanu, Charles Blundell, Jovana Mitrovic

We propose a new semi-supervised learning method, Semantic Positives via Pseudo-Labels (SemPPL), that combines labelled and unlabelled data to learn informative representations.

Contrastive Learning · Pseudo Label
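
As a hedged sketch of the pseudo-labelling step (the mechanics below are our assumption, not the released code), unlabelled embeddings can be given pseudo-labels by a k-nearest-neighbour vote over labelled embeddings, so that points sharing a pseudo-label can act as extra positives in a contrastive loss:

```python
# Hypothetical sketch: assign pseudo-labels to unlabelled embeddings via a
# k-NN vote over labelled embeddings, so same-label points can serve as
# additional "semantic positives" in a contrastive objective.
import numpy as np
from collections import Counter

def knn_pseudo_labels(unlab, lab, lab_y, k=5):
    # cosine similarity between unlabelled and labelled embeddings
    unlab = unlab / np.linalg.norm(unlab, axis=1, keepdims=True)
    lab = lab / np.linalg.norm(lab, axis=1, keepdims=True)
    sims = unlab @ lab.T
    nn = np.argsort(-sims, axis=1)[:, :k]
    return np.array([Counter(lab_y[idx]).most_common(1)[0][0] for idx in nn])

rng = np.random.default_rng(0)
lab = rng.normal(size=(100, 16)); lab_y = rng.integers(0, 10, size=100)
unlab = rng.normal(size=(500, 16))
pseudo_y = knn_pseudo_labels(unlab, lab, lab_y)   # shape (500,)
```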

Continuous diffusion for categorical data

no code implementations28 Nov 2022 Sander Dieleman, Laurent Sartran, Arman Roshannai, Nikolay Savinov, Yaroslav Ganin, Pierre H. Richemond, Arnaud Doucet, Robin Strudel, Chris Dyer, Conor Durkan, Curtis Hawthorne, Rémi Leblond, Will Grathwohl, Jonas Adler

Diffusion models have quickly become the go-to paradigm for generative modelling of perceptual signals (such as images and sound) through iterative refinement.

Language Modelling
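
One way to picture continuous diffusion over discrete tokens, assuming an embed-then-noise setup (an illustration, not the paper's training recipe), is a Gaussian forward-noising step in embedding space:

```python
# Toy sketch (assumed mechanics, not the paper's implementation): embed
# discrete tokens into R^d, then apply a Gaussian forward-noising step as in
# standard continuous diffusion.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 1000, 64
embedding = rng.normal(size=(vocab_size, dim))

def forward_noise(token_ids, t, embedding, rng):
    """Noise embedded tokens: x_t = sqrt(1-t)*x_0 + sqrt(t)*eps, t in [0, 1]."""
    x0 = embedding[token_ids]
    eps = rng.normal(size=x0.shape)
    return np.sqrt(1.0 - t) * x0 + np.sqrt(t) * eps

tokens = rng.integers(0, vocab_size, size=32)
x_t = forward_noise(tokens, t=0.3, embedding=embedding, rng=rng)
```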

Categorical SDEs with Simplex Diffusion

no code implementations26 Oct 2022 Pierre H. Richemond, Sander Dieleman, Arnaud Doucet

Diffusion models typically operate in the standard framework of generative modelling by producing continuously-valued datapoints.

Text Generation
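
As a loose illustration of staying on the probability simplex while noising categorical data (this is a convex-mixing toy, not the SDE defined in the paper), one-hot vectors can be interpolated with Dirichlet samples:

```python
# Loose illustration (not the paper's SDE): noising a one-hot categorical
# vector while staying on the probability simplex, via a convex combination
# with a Dirichlet sample.
import numpy as np

def simplex_noise(one_hot, t, rng, alpha=1.0):
    """t=0 returns the one-hot vector, t=1 a pure Dirichlet sample."""
    noise = rng.dirichlet(alpha * np.ones_like(one_hot))
    return (1.0 - t) * one_hot + t * noise  # nonnegative, sums to 1

rng = np.random.default_rng(0)
x0 = np.eye(5)[2]                               # one-hot for category 2
print(simplex_noise(x0, t=0.4, rng=rng).sum())  # 1.0
```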

Data Distributional Properties Drive Emergent In-Context Learning in Transformers

2 code implementations22 Apr 2022 Stephanie C. Y. Chan, Adam Santoro, Andrew K. Lampinen, Jane X. Wang, Aaditya Singh, Pierre H. Richemond, Jay McClelland, Felix Hill

In further experiments, we found that naturalistic data distributions elicited in-context learning in transformers, but not in recurrent models.

Few-Shot Learning · In-Context Learning
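
A hedged toy generator for one of the distributional properties in question, burstiness combined with a skewed class marginal (illustrative only, not the authors' dataset code):

```python
# Illustrative generator (not the authors' code): build "bursty" training
# sequences in which a few classes recur within each context window, drawn
# from a skewed (Zipf-like) marginal over classes.
import numpy as np

def bursty_sequence(n_classes, seq_len, n_bursty, rng):
    ranks = np.arange(1, n_classes + 1)
    marginal = 1.0 / ranks
    marginal /= marginal.sum()                               # skewed class marginal
    bursty = rng.choice(n_classes, size=n_bursty, replace=False, p=marginal)
    return rng.choice(bursty, size=seq_len)                  # classes repeat in-context

rng = np.random.default_rng(0)
print(bursty_sequence(n_classes=100, seq_len=8, n_bursty=3, rng=rng))
```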

Zipfian environments for Reinforcement Learning

1 code implementation15 Mar 2022 Stephanie C. Y. Chan, Andrew K. Lampinen, Pierre H. Richemond, Felix Hill

As humans and animals learn in the natural world, they encounter distributions of entities, situations and events that are far from uniform.

Reinforcement Learning (RL) +1
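
A minimal sketch of the Zipfian (power-law) skew such environments are built around, with an illustrative exponent and entity count:

```python
# Minimal sketch: sample entity indices from a Zipfian (power-law)
# distribution, p(rank) proportional to 1 / rank**exponent.
import numpy as np

def zipfian_sample(n_entities, n_samples, exponent, rng):
    ranks = np.arange(1, n_entities + 1)
    probs = 1.0 / ranks**exponent
    probs /= probs.sum()
    return rng.choice(n_entities, size=n_samples, p=probs)

rng = np.random.default_rng(0)
samples = zipfian_sample(n_entities=1000, n_samples=10_000, exponent=2.0, rng=rng)
print(np.bincount(samples, minlength=1000)[:5])  # head entities dominate
```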

Biologically inspired architectures for sample-efficient deep reinforcement learning

no code implementations25 Nov 2019 Pierre H. Richemond, Arinbjörn Kolbeinsson, Yike Guo

Deep reinforcement learning comes at a heavy price in terms of sample efficiency and overparameterization in the neural networks used for function approximation.

Reinforcement Learning (RL)

How many weights are enough : can tensor factorization learn efficient policies ?

no code implementations25 Sep 2019 Pierre H. Richemond, Arinbjorn Kolbeinsson, Yike Guo

Deep reinforcement learning comes at a heavy price in terms of sample efficiency and overparameterization in the neural networks used for function approximation.

Reinforcement Learning (RL)
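
As a generic sketch of the weight-saving idea in the title (a plain low-rank factorization, not necessarily the tensor decomposition used in the paper), a dense policy-network layer can be replaced by two thin factors:

```python
# Generic sketch (not necessarily the paper's decomposition): replace a dense
# policy-network layer W (m x n) by a rank-r factorization U @ V, cutting the
# parameter count from m*n down to r*(m + n).
import numpy as np

m, n, r = 256, 256, 16
rng = np.random.default_rng(0)
U = rng.normal(size=(m, r)) / np.sqrt(r)
V = rng.normal(size=(r, n)) / np.sqrt(n)

def factorized_linear(x):
    return (x @ U) @ V          # same shape as x @ (U @ V), far fewer weights

x = rng.normal(size=(4, m))     # a batch of hypothetical state features
y = factorized_linear(x)        # shape (4, n)
print(m * n, "dense params vs", r * (m + n), "factorized params")
```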

Static Activation Function Normalization

no code implementations3 May 2019 Pierre H. Richemond, Yike Guo

Recent seminal work at the intersection of deep neural network practice and random matrix theory has linked the convergence speed and robustness of these networks with the combination of random weight initialization and the nonlinear activation function in use.
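
One hedged reading of static activation normalization, illustrated below with constants estimated once by Monte-Carlo so that the activation's output has zero mean and unit variance under a standard Gaussian input (an assumption about the mechanics, not the paper's exact procedure):

```python
# Illustrative sketch (assumed mechanics): statically normalize an activation
# so that its output has zero mean and unit variance for x ~ N(0, 1), using
# fixed constants computed once by Monte-Carlo.
import numpy as np

def static_normalize(phi, n=1_000_000, seed=0):
    x = np.random.default_rng(seed).normal(size=n)
    y = phi(x)
    shift, scale = y.mean(), y.std()
    return lambda x: (phi(x) - shift) / scale   # fixed constants, no batch stats

relu = lambda x: np.maximum(x, 0.0)
norm_relu = static_normalize(relu)
z = np.random.default_rng(1).normal(size=100_000)
print(norm_relu(z).mean(), norm_relu(z).std())  # ~0, ~1
```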

Combining learning rate decay and weight decay with complexity gradient descent - Part I

no code implementations7 Feb 2019 Pierre H. Richemond, Yike Guo

The role of $L^2$ regularization, in the specific case of deep neural networks rather than more traditional machine learning models, is still not fully elucidated.
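
For context on how weight decay interacts with the learning-rate schedule (a standard SGD identity, not a result specific to this paper): an $L^2$ penalty of strength $\lambda$ enters the update as a multiplicative shrinkage whose size is tied to the current learning rate.

```latex
% SGD step on L(w) + (\lambda/2)\|w\|_2^2 with learning rate \eta_t:
% the L^2 penalty acts as a shrinkage of strength \eta_t \lambda, so
% learning-rate decay directly rescales the effective weight decay.
w_{t+1} = (1 - \eta_t \lambda)\, w_t - \eta_t \nabla L(w_t)
```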

Diffusing Policies : Towards Wasserstein Policy Gradient Flows

no code implementations ICLR 2018 Pierre H. Richemond, Brendan Maginnis

We derive policy gradients where the change in policy is limited to a small Wasserstein distance (or trust region).
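
A schematic statement of the Wasserstein trust-region idea in proximal (JKO-style) form; the notation is ours and only meant to convey the shape of the update:

```latex
% Schematic Wasserstein-proximal (JKO-style) policy update: improve expected
% return while keeping the new policy within a small Wasserstein distance of
% the old one (notation illustrative, not necessarily the paper's).
\pi_{k+1} = \arg\max_{\pi} \;
  \mathbb{E}_{a \sim \pi(\cdot \mid s)}\!\left[Q^{\pi_k}(s,a)\right]
  \;-\; \frac{1}{2\tau}\, W_2^2\!\left(\pi, \pi_k\right)
```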

Representing Entropy : A short proof of the equivalence between soft Q-learning and policy gradients

no code implementations ICLR 2018 Pierre H. Richemond, Brendan Maginnis

Two main families of reinforcement learning algorithms, Q-learning and policy gradients, have recently been proven to be equivalent when using a softmax relaxation on one part, and an entropic regularization on the other.

Q-Learning · Reinforcement Learning (RL) +1
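
The softmax relation underlying the stated equivalence can be written with the standard soft Q-learning identities at temperature $\tau$ (textbook formulas, not a paraphrase of the paper's proof):

```latex
% Standard soft (entropy-regularized) Q-learning identities with temperature \tau:
% the optimal policy is a softmax over soft Q-values, which is the bridge to
% entropy-regularized policy gradients.
V_{\mathrm{soft}}(s) = \tau \log \sum_{a} \exp\!\big(Q_{\mathrm{soft}}(s,a)/\tau\big),
\qquad
\pi^{*}(a \mid s) = \exp\!\Big(\big(Q_{\mathrm{soft}}(s,a) - V_{\mathrm{soft}}(s)\big)/\tau\Big)
```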

A short variational proof of equivalence between policy gradients and soft Q learning

no code implementations22 Dec 2017 Pierre H. Richemond, Brendan Maginnis

Two main families of reinforcement learning algorithms, Q-learning and policy gradients, have recently been proven to be equivalent when using a softmax relaxation on one part, and an entropic regularization on the other.

Q-Learning · Reinforcement Learning (RL) +1

On Wasserstein Reinforcement Learning and the Fokker-Planck equation

no code implementations19 Dec 2017 Pierre H. Richemond, Brendan Maginnis

We derive policy gradients where the change in policy is limited to a small Wasserstein distance (or trust region).

Reinforcement Learning (RL)

Efficiently applying attention to sequential data with the Recurrent Discounted Attention unit

no code implementations ICLR 2018 Brendan Maginnis, Pierre H. Richemond

On tasks with a single output, the RWA, RDA and GRU units learn much more quickly than the LSTM and achieve better performance.
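
A simplified sketch of the discounted recurrent-attention family the RDA belongs to, maintaining discounted running numerator and denominator of an attention-weighted average over past inputs (gating details are omitted relative to the paper):

```python
# Simplified sketch (gating details omitted relative to the paper): a
# recurrent, discounted attention-weighted running average over past inputs.
import numpy as np

def rda_like_scan(features, scores, discount=0.9):
    """features: (T, d) values; scores: (T,) unnormalized attention logits."""
    num = np.zeros(features.shape[1])
    den = 1e-8
    outputs = []
    for z_t, a_t in zip(features, scores):
        w_t = np.exp(a_t)
        num = discount * num + w_t * z_t     # discounted running numerator
        den = discount * den + w_t           # discounted running denominator
        outputs.append(np.tanh(num / den))   # attention-weighted average state
    return np.stack(outputs)

rng = np.random.default_rng(0)
h = rda_like_scan(rng.normal(size=(20, 8)), rng.normal(size=20))
print(h.shape)  # (20, 8)
```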
