Search Results for author: Harm van Seijen

Found 18 papers, 11 papers with code

Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning

1 code implementation • 30 Sep 2023 • Mingde Zhao, Safa Alver, Harm van Seijen, Romain Laroche, Doina Precup, Yoshua Bengio

Inspired by human conscious planning, we propose Skipper, a model-based reinforcement learning framework utilizing spatio-temporal abstractions to generalize better in novel situations.

Decision Making · Model-based Reinforcement Learning +2

Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information

1 code implementation • 31 Oct 2022 • Riashat Islam, Manan Tomar, Alex Lamb, Yonathan Efroni, Hongyu Zang, Aniket Didolkar, Dipendra Misra, Xin Li, Harm van Seijen, Remi Tachet des Combes, John Langford

We find that contemporary representation learning techniques can fail on datasets where the noise is a complex and time dependent process, which is prevalent in practical applications.

Offline RL · Reinforcement Learning (RL) +1

Towards Evaluating Adaptivity of Model-Based Reinforcement Learning Methods

1 code implementation • 25 Apr 2022 • Yi Wan, Ali Rahimi-Kalahroudi, Janarthanan Rajendran, Ida Momennejad, Sarath Chandar, Harm van Seijen

We empirically validate these insights in the case of linear function approximation by demonstrating that a modified version of linear Dyna achieves effective adaptation to local changes.

Model-based Reinforcement Learning · reinforcement-learning +1
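The snippet above names linear Dyna without describing it. As a rough sketch of what a generic linear Dyna planning step looks like (following the standard formulation, with illustrative variable names; this is not the modified variant studied in the paper):

```python
import numpy as np

# Rough sketch of a generic linear Dyna planning step (illustrative, not the
# paper's modified variant). The learned linear model predicts expected next
# features and expected reward: E[phi'] ~= F @ phi, E[r] ~= b @ phi.

def linear_dyna_planning_step(theta, F, b, phi_sampled, alpha, gamma):
    """One planning update on a sampled feature vector phi_sampled."""
    r_hat = b @ phi_sampled            # model-predicted reward
    phi_next_hat = F @ phi_sampled     # model-predicted expected next features
    # TD error computed entirely from the learned linear model
    delta = r_hat + gamma * (theta @ phi_next_hat) - theta @ phi_sampled
    return theta + alpha * delta * phi_sampled

# Example usage with placeholder model parameters
d = 8
theta = np.zeros(d)
F = np.eye(d) * 0.9
b = np.random.randn(d) * 0.01
phi = np.random.rand(d)
theta = linear_dyna_planning_step(theta, F, b, phi, alpha=0.1, gamma=0.95)
```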

Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks

1 code implementation • 13 Jul 2021 • Sungryull Sohn, Sungtae Lee, Jongwook Choi, Harm van Seijen, Mehdi Fatemi, Honglak Lee

We propose the k-Shortest-Path (k-SP) constraint: a novel constraint on the agent's trajectory that improves the sample efficiency in sparse-reward MDPs.

Continuous Control · reinforcement-learning +1

Systematic generalisation with group invariant predictions

no code implementations • ICLR 2021 • Faruk Ahmed, Yoshua Bengio, Harm van Seijen, Aaron Courville

We consider situations where the presence of dominant simpler correlations with the target variable in a training set can cause an SGD-trained neural network to be less reliant on more persistently-correlating complex features.

A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms

1 code implementation • 2 Oct 2020 • Shangtong Zhang, Romain Laroche, Harm van Seijen, Shimon Whiteson, Remi Tachet des Combes

In the second scenario, we consider optimizing a discounted objective ($\gamma < 1$) and propose to interpret the omission of the discounting in the actor update from an auxiliary task perspective and provide supporting empirical results.

Representation Learning
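For context on the "omission of the discounting in the actor update" mentioned above, here is a minimal sketch contrasting the actor update implied by the discounted policy-gradient theorem with the update most implementations use in practice. The variable names are assumptions for illustration; this is not the paper's code.

```python
import numpy as np

# Minimal sketch of the two actor-update variants behind the discounting
# mismatch. grad_log_pi is the score function of the taken action at step t;
# delta is a critic-based advantage/TD-error estimate.

def actor_update_discounted(theta, grad_log_pi, delta, t, alpha, gamma):
    # Update implied by the discounted policy-gradient theorem:
    # the gradient contribution at time t is weighted by gamma ** t.
    return theta + alpha * (gamma ** t) * delta * grad_log_pi

def actor_update_common(theta, grad_log_pi, delta, t, alpha, gamma):
    # Update used by most practical implementations: the gamma ** t
    # weighting is dropped, even though the critic still uses gamma < 1.
    return theta + alpha * delta * grad_log_pi

# Example usage with placeholder values
theta = np.zeros(4)
g = np.array([0.1, -0.2, 0.0, 0.3])
theta_a = actor_update_discounted(theta, g, delta=0.5, t=3, alpha=0.01, gamma=0.99)
theta_b = actor_update_common(theta, g, delta=0.5, t=3, alpha=0.01, gamma=0.99)
```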

The LoCA Regret: A Consistent Metric to Evaluate Model-Based Behavior in Reinforcement Learning

2 code implementations • NeurIPS 2020 • Harm van Seijen, Hadi Nekoei, Evan Racah, Sarath Chandar

For example, the common single-task sample-efficiency metric conflates improvements due to model-based learning with various other aspects, such as representation learning, making it difficult to assess true progress on model-based RL.

Model-based Reinforcement Learning · Reinforcement Learning (RL) +1

Using a Logarithmic Mapping to Enable Lower Discount Factors in Reinforcement Learning

2 code implementations • NeurIPS 2019 • Harm van Seijen, Mehdi Fatemi, Arash Tavakoli

In an effort to better understand the different ways in which the discount factor affects the optimization process in reinforcement learning, we designed a set of experiments to study each effect in isolation.

General Reinforcement Learning · reinforcement-learning +1
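The snippet above does not spell out the mapping itself. The sketch below only illustrates the general idea of performing value updates under a logarithmic mapping, with made-up constants and names; it should not be read as the paper's exact algorithm.

```python
import numpy as np

# Loose illustration of a value update under a logarithmic mapping:
# Q-values are stored as f(Q), the TD target is formed in the regular value
# space, then mapped back before updating. Constants C and D are illustrative.

C, D = 1.0, 1e-2

def f(x):                      # mapping into log space
    return C * np.log(x + D)

def f_inv(x):                  # inverse mapping back to value space
    return np.exp(x / C) - D

def log_space_update(q_log, s, a, r, s_next, alpha, gamma):
    """q_log stores f(Q); rewards are assumed non-negative in this sketch."""
    target = r + gamma * f_inv(np.max(q_log[s_next]))  # regular-space target
    q_log[s, a] += alpha * (f(target) - q_log[s, a])   # update in log space
    return q_log

# Example usage on a tiny tabular problem (5 states, 2 actions)
q_log = np.full((5, 2), f(0.0))
q_log = log_space_update(q_log, s=0, a=1, r=1.0, s_next=2, alpha=0.5, gamma=0.96)
```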

Learning Invariances for Policy Generalization

1 code implementation • 7 Sep 2018 • Remi Tachet, Philip Bachman, Harm van Seijen

While recent progress has spawned very powerful machine learning systems, those agents remain extremely specialized and fail to transfer the knowledge they gain to similar yet unseen tasks.

BIG-bench Machine Learning · Data Augmentation +3

Separation of Concerns in Reinforcement Learning

no code implementations • 15 Dec 2016 • Harm van Seijen, Mehdi Fatemi, Joshua Romoff, Romain Laroche

In this paper, we propose a framework for solving a single-agent task by using multiple agents, each focusing on different aspects of the task.

reinforcement-learning · Reinforcement Learning (RL)

Effective Multi-step Temporal-Difference Learning for Non-Linear Function Approximation

no code implementations • 18 Aug 2016 • Harm van Seijen

Furthermore, based on our analysis, we propose a new multi-step TD method for non-linear function approximation that addresses this issue.

True Online Temporal-Difference Learning

1 code implementation • 13 Dec 2015 • Harm van Seijen, A. Rupam Mahmood, Patrick M. Pilarski, Marlos C. Machado, Richard S. Sutton

Our results suggest that the true online methods indeed dominate the regular methods.

Atari Games
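For reference, a compact sketch of the true online TD(λ) update with linear function approximation, written from the commonly published pseudocode. The names and the single-episode treatment are simplifications, not the repository's implementation.

```python
import numpy as np

# Compact sketch of true online TD(lambda) with linear function approximation.
# The stream of (phi, reward, phi_next) transitions is treated as a single
# episode; terminal transitions should pass a zero feature vector as phi_next.

def true_online_td_lambda(transitions, d, alpha, gamma, lam):
    theta = np.zeros(d)     # weight vector
    e = np.zeros(d)         # dutch-style eligibility trace
    v_old = 0.0
    for phi, r, phi_next in transitions:
        v = theta @ phi
        v_next = theta @ phi_next
        delta = r + gamma * v_next - v
        # dutch-trace update
        e = gamma * lam * e + phi - alpha * gamma * lam * (e @ phi) * phi
        # true online weight update
        theta = (theta + alpha * (delta + v - v_old) * e
                 - alpha * (v - v_old) * phi)
        v_old = v_next
    return theta

# Example usage on a two-step toy transition stream
steps = [
    (np.array([1.0, 0.0]), 1.0, np.array([0.0, 1.0])),
    (np.array([0.0, 1.0]), 0.0, np.zeros(2)),   # terminal transition
]
theta = true_online_td_lambda(steps, d=2, alpha=0.1, gamma=0.9, lam=0.8)
```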

An Empirical Evaluation of True Online TD(λ)

no code implementations • 1 Jul 2015 • Harm van Seijen, A. Rupam Mahmood, Patrick M. Pilarski, Richard S. Sutton

Our results confirm the strength of true online TD(λ): 1) for sparse feature vectors, the computational overhead with respect to TD(λ) is minimal; for non-sparse features the computation time is at most twice that of TD(λ), 2) across all domains/representations the learning speed of true online TD(λ) is often better, but never worse, than that of TD(λ), and 3) true online TD(λ) is easier to use, because it does not require choosing between trace types, and it is generally more stable with respect to the step-size.
