no code implementations • 29 Feb 2024 • Clare Lyle, Zeyu Zheng, Khimya Khetarpal, Hado van Hasselt, Razvan Pascanu, James Martens, Will Dabney
Underpinning the past decades of work on the design, initialization, and optimization of neural networks is a seemingly innocuous assumption: that the network is trained on a stationary data distribution.
no code implementations • 5 Feb 2024 • Sugandha Sharma, Guy Davidson, Khimya Khetarpal, Anssi Kanervisto, Udit Arora, Katja Hofmann, Ida Momennejad
First, we analyze extensive human gameplay data from Xbox's Bleeding Edge (100K+ games), uncovering behavioral patterns in a complex task space.
no code implementations • 16 Oct 2023 • Thomas Jiralerspong, Flemming Kondrup, Doina Precup, Khimya Khetarpal
The ability to plan at many different levels of abstraction enables agents to envision the long-term repercussions of their decisions and thus supports sample-efficient learning.
1 code implementation • 27 Apr 2023 • Somjit Nath, Gopeshh Raaj Subbaraj, Khimya Khetarpal, Samira Ebrahimi Kahou
Deep reinforcement learning has shown significant progress in extracting useful representations from high-dimensional inputs, albeit often relying on hand-crafted auxiliary tasks and pseudo-rewards.
no code implementations • 30 Dec 2022 • Khimya Khetarpal, Claire Vernade, Brendan O'Donoghue, Satinder Singh, Tom Zahavy
We study the problem of planning under model uncertainty in an online meta-reinforcement learning (RL) setting where an agent is presented with a sequence of related tasks with limited interactions per task.
1 code implementation • 24 Jan 2022 • Andrei Nica, Khimya Khetarpal, Doina Precup
Decision-making AI agents are often faced with two important challenges: the depth of the planning horizon, and the branching factor due to having many choices.
1 code implementation • NeurIPS 2021 • Khimya Khetarpal, Zafarali Ahmed, Gheorghe Comanici, Doina Precup
Humans and animals have the ability to reason and make predictions about different courses of action at many time scales.
3 code implementations • 2 Aug 2021 • Fabrice Normandin, Florian Golemo, Oleksiy Ostapenko, Pau Rodriguez, Matthew D Riemer, Julio Hurtado, Khimya Khetarpal, Ryan Lindeborg, Lucas Cecchi, Timothée Lesort, Laurent Charlin, Irina Rish, Massimo Caccia
We propose a taxonomy of settings, where each setting is described as a set of assumptions.
1 code implementation • 3 Feb 2021 • Arushi Jain, Gandharv Patil, Ayush Jain, Khimya Khetarpal, Doina Precup
Reinforcement learning algorithms are typically geared towards optimizing the expected return of an agent.
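The expected-return objective this abstract refers to is the expectation of the discounted sum of rewards along a trajectory. A minimal sketch of that quantity (the function name and the reward values below are illustrative, not from the paper):

```python
# Sketch of the standard RL quantity agents optimize in expectation:
# the discounted return G = sum_t gamma^t * r_t over one trajectory.
# The reward sequence below is an illustrative example.

def discounted_return(rewards, gamma=0.99):
    """Compute the discounted return of a single trajectory of rewards."""
    g = 0.0
    # Fold backwards so each step applies one extra discount factor.
    for r in reversed(rewards):
        g = r + gamma * g
    return g

rewards = [1.0, 0.0, 2.0]
print(discounted_return(rewards, gamma=0.9))  # 1 + 0.9*0 + 0.81*2 ≈ 2.62
```

Risk-sensitive or constrained variants (as studied in this line of work) replace or augment this expectation with other statistics of the same return.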
no code implementations • 25 Dec 2020 • Khimya Khetarpal, Matthew Riemer, Irina Rish, Doina Precup
In this article, we aim to provide a literature review of different formulations and approaches to continual reinforcement learning (RL), also known as lifelong or non-stationary RL.
2 code implementations • ICLR 2021 • Amy Zhang, Shagun Sodhani, Khimya Khetarpal, Joelle Pineau
Further, we provide transfer and generalization bounds based on task and state similarity, along with sample complexity bounds that depend on the aggregate number of samples across tasks rather than on the number of tasks, a significant improvement over prior work that uses the same environment assumptions.
1 code implementation • ICML 2020 • Khimya Khetarpal, Zafarali Ahmed, Gheorghe Comanici, David Abel, Doina Precup
In the context of embodied agents, Gibson (1977) coined the term "affordances" to describe the fact that certain states enable an agent to perform certain actions.
3 code implementations • 1 Jan 2020 • Khimya Khetarpal, Martin Klissarov, Maxime Chevalier-Boisvert, Pierre-Luc Bacon, Doina Precup
Temporal abstraction refers to the ability of an agent to use the behaviours of controllers that act for a limited, variable amount of time.
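Such temporally extended behaviours are commonly formalized as options (Sutton, Precup &amp; Singh, 1999): each option couples an initiation set, an intra-option policy, and a termination condition. A minimal sketch under that formalism (the class and the toy gridworld option below are illustrative, not this paper's API):

```python
# Sketch of the options framework: an option bundles where it can start,
# how it acts, and when it stops. Names here are illustrative.
from dataclasses import dataclass
from typing import Callable, Set

@dataclass
class Option:
    initiation: Set[int]                 # states where the option may be invoked
    policy: Callable[[int], int]         # intra-option action selection
    termination: Callable[[int], float]  # probability of terminating in a state

# A toy option that keeps taking action 1 ("right") until state 3 is reached.
go_right = Option(
    initiation={0, 1, 2},
    policy=lambda s: 1,
    termination=lambda s: 1.0 if s == 3 else 0.0,
)

print(go_right.policy(0), go_right.termination(3))  # 1 1.0
```

The "limited, variable amount of time" in the abstract corresponds to the stochastic termination condition: the option runs until `termination(s)` fires.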
2 code implementations • 26 Nov 2018 • Khimya Khetarpal, Shagun Sodhani, Sarath Chandar, Doina Precup
To achieve general artificial intelligence, reinforcement learning (RL) agents should learn not only to optimize returns for one specific task but also to constantly build more complex skills and scaffold their knowledge about the world, without forgetting what has already been learned.
1 code implementation • 25 Jul 2018 • Khimya Khetarpal, Doina Precup
When humans perform a task, such as playing a game, they selectively pay attention to certain parts of the visual input, gathering relevant information and sequentially combining it to build a representation from the sensory data.
1 code implementation • 21 Jul 2018 • Arushi Jain, Khimya Khetarpal, Doina Precup
We propose an optimization objective that learns safe options by encouraging the agent to visit states with higher behavioural consistency.