no code implementations • 13 Dec 2022 • Olivier Moulin, Vincent Francois-Lavet, Mark Hoogendoorn
An ecosystem of agents, each with its own policy offering some but limited generalizability, has proven to be a reliable approach to increasing generalization across procedurally generated environments.
no code implementations • 13 Apr 2022 • Olivier Moulin, Vincent Francois-Lavet, Paul Elbers, Mark Hoogendoorn
Adapting a Reinforcement Learning (RL) agent to an unseen environment is a difficult task due to the typical overfitting on the training environment.
1 code implementation • 4 Apr 2022 • Taewoon Kim, Michael Cochez, Vincent Francois-Lavet, Mark Neerincx, Piek Vossen
Inspired by cognitive science, we explicitly model an agent with both semantic and episodic memory systems, and show that this combination outperforms having just one of the two memory systems.
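The idea of a dual memory system can be illustrated with a toy sketch. All names and the consolidation rule below are hypothetical simplifications for illustration, not the paper's actual model: episodic memory stores individual timestamped events, semantic memory stores generalized facts distilled from them.

```python
from collections import deque

class DualMemoryAgent:
    """Toy agent with separate episodic and semantic memory stores.

    Illustrative sketch only; the class name, capacity limit, and
    consolidation rule are assumptions, not the paper's implementation.
    """

    def __init__(self, episodic_capacity=8):
        # Episodic memory: a bounded buffer of (object, location, timestep) events.
        self.episodic = deque(maxlen=episodic_capacity)
        # Semantic memory: generalized facts, here object -> last known location.
        self.semantic = {}

    def observe(self, obj, location, t):
        self.episodic.append((obj, location, t))

    def consolidate(self):
        # Distill regularities from episodic traces into semantic memory.
        for obj, location, _ in self.episodic:
            self.semantic[obj] = location

    def answer(self, obj):
        # Prefer the exact episodic trace; fall back to semantic knowledge.
        for o, loc, _ in reversed(self.episodic):
            if o == obj:
                return loc
        return self.semantic.get(obj)
```

The benefit of keeping both stores shows up when the episodic buffer overflows: old events are evicted, but consolidated semantic facts still let the agent answer questions about them.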
1 code implementation • 22 Nov 2021 • Geoffrey van Driessel, Vincent Francois-Lavet
We learn a low-dimensional encoding of the environment, meant to capture summarizing abstractions, from which the internal dynamics and value functions are learned.
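The pipeline described above — encode the observation into a small abstract state, then learn dynamics and values in that latent space — can be sketched minimally. The dimensions, the linear maps, and the one-step lookahead below are illustrative assumptions (real components would be trained neural networks), not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a high-dimensional observation, a tiny latent space.
OBS_DIM, LATENT_DIM, N_ACTIONS = 64, 3, 4

# Learnable components, shown as fixed random matrices for illustration.
W_enc = rng.normal(size=(LATENT_DIM, OBS_DIM))                   # encoder
W_trans = rng.normal(size=(N_ACTIONS, LATENT_DIM, LATENT_DIM))   # per-action dynamics
w_value = rng.normal(size=LATENT_DIM)                            # value head

def encode(obs):
    """Map a raw observation to the low-dimensional abstract state."""
    return np.tanh(W_enc @ obs)

def predict_next(z, action):
    """Internal dynamics model: next latent state under a given action."""
    return np.tanh(W_trans[action] @ z)

def value(z):
    """Value function estimated directly in the latent space."""
    return float(w_value @ z)

# Planning never touches the raw observation again: a one-step lookahead
# is performed entirely in the latent space.
obs = rng.normal(size=OBS_DIM)
z = encode(obs)
best_action = max(range(N_ACTIONS), key=lambda a: value(predict_next(z, a)))
```

The design point is that once the encoder captures the summarizing abstractions, both the transition model and the value function operate on a 3-dimensional state instead of a 64-dimensional observation, which is what makes planning in the learned model cheap.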
no code implementations • 28 Sep 2021 • Amjad Yousef Majid, Serge Saaybi, Tomas van Rietbergen, Vincent Francois-Lavet, R Venkatesha Prasad, Chris Verhoeven
Deep Reinforcement Learning (DRL) and Evolution Strategies (ESs) have surpassed human-level control in many sequential decision-making problems, yet many open challenges still exist.
no code implementations • 2 Mar 2020 • Stefano Alletto, Shenyang Huang, Vincent Francois-Lavet, Yohei Nakata, Guillaume Rabusseau
Almost all neural architecture search methods are evaluated in terms of the performance (i.e., test accuracy) of the model structures that they find.
3 code implementations • 30 Nov 2018 • Vincent Francois-Lavet, Peter Henderson, Riashat Islam, Marc G. Bellemare, Joelle Pineau
Deep reinforcement learning is the combination of reinforcement learning (RL) and deep learning.
1 code implementation • 9 May 2018 • Joshua Romoff, Peter Henderson, Alexandre Piché, Vincent Francois-Lavet, Joelle Pineau
However, the introduction of corrupt or stochastic rewards can yield high variance in learning.
no code implementations • 22 Sep 2017 • Vincent Francois-Lavet, Guillaume Rabusseau, Joelle Pineau, Damien Ernst, Raphael Fonteneau
This paper provides an analysis of the tradeoff between asymptotic bias (suboptimality with unlimited data) and overfitting (additional suboptimality due to limited data) in the context of reinforcement learning with partial observability.