no code implementations • 26 Mar 2024 • Philip Lippmann, Matthijs T. J. Spaan, Jie Yang
Natural Language Processing (NLP) models optimized for predictive performance often make high confidence errors and suffer from vulnerability to adversarial and out-of-distribution data.
no code implementations • 19 Feb 2024 • Davide Mambelli, Stephan Bongers, Onno Zoeter, Matthijs T. J. Spaan, Frans A. Oliehoek
A well-established off-policy objective is the excursion objective.
no code implementations • 26 Jul 2023 • Qisong Yang, Thiago D. Simão, Nils Jansen, Simon H. Tindemans, Matthijs T. J. Spaan
Drawing from transfer learning, we also regularize a target policy (the student) towards the guide while the student is unreliable and gradually eliminate the influence of the guide as training progresses.
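The guide-regularization idea can be sketched as a training loss with a decaying weight on the divergence between student and guide policies. The decay schedule, function names, and KL direction below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete action distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def regularized_loss(rl_loss, student_probs, guide_probs, step, decay=1e-3):
    """RL loss plus a guide-regularization term whose weight decays
    as training progresses (hypothetical exponential schedule)."""
    beta = np.exp(-decay * step)  # influence of the guide fades over time
    return rl_loss + beta * kl_divergence(student_probs, guide_probs)
```

Early in training the KL term dominates, pulling the student toward the guide; as `step` grows, `beta` vanishes and the student optimizes the RL objective alone.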
no code implementations • 12 Jun 2023 • Moritz A. Zanger, Wendelin Böhmer, Matthijs T. J. Spaan
In contrast to classical reinforcement learning, distributional reinforcement learning algorithms aim to learn the distribution of returns rather than their expected value.
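As an illustration of the distributional view, one common concrete instantiation (not necessarily the one used in this paper) is the categorical (C51-style) representation: the return distribution is a histogram over fixed atoms, and the Bellman-updated distribution is projected back onto that support:

```python
import numpy as np

def project_distribution(atoms, probs, reward, gamma, v_min, v_max):
    """Project the Bellman target r + gamma * Z onto the fixed support
    `atoms` (a simplified C51-style categorical projection sketch)."""
    n = len(atoms)
    dz = (v_max - v_min) / (n - 1)
    # Shifted/scaled atoms, clipped to the support's range
    tz = np.clip(reward + gamma * np.asarray(atoms, float), v_min, v_max)
    b = (tz - v_min) / dz                      # fractional atom index
    lo, hi = np.floor(b).astype(int), np.ceil(b).astype(int)
    out = np.zeros(n)
    for j in range(n):
        if lo[j] == hi[j]:                     # lands exactly on an atom
            out[lo[j]] += probs[j]
        else:                                  # split mass between neighbors
            out[lo[j]] += probs[j] * (hi[j] - b[j])
            out[hi[j]] += probs[j] * (b[j] - lo[j])
    return out
```

The expected value `np.dot(atoms, probs)` recovers the classical scalar return, while the full histogram retains information about the spread of returns.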
no code implementations • 9 Jun 2023 • Max Weltevrede, Matthijs T. J. Spaan, Wendelin Böhmer
We motivate mathematically and show empirically that generalisation to tasks that are "reachable" during training is improved by increasing the diversity of transitions in the replay buffer.

no code implementations • 4 Jun 2023 • Miguel Suau, Matthijs T. J. Spaan, Frans A. Oliehoek
In this paper, we provide a mathematical characterization of this phenomenon, which we refer to as policy confounding, and show, through a series of examples, when and how it occurs in practice.
no code implementations • 21 Oct 2022 • Yaniv Oren, Matthijs T. J. Spaan, Wendelin Böhmer
One of the best-studied and best-performing planning approaches used in Model-Based Reinforcement Learning (MBRL) is Monte-Carlo Tree Search (MCTS).
1 code implementation • 1 Jul 2022 • Miguel Suau, Jinke He, Mustafa Mert Çelikok, Matthijs T. J. Spaan, Frans A. Oliehoek
Due to the high sample complexity of reinforcement learning, simulation is, as of today, critical for its successful application.
no code implementations • 6 Jun 2022 • Sebastian Junges, Matthijs T. J. Spaan
The key ideas to accelerate analysis of such programs are (1) to treat the behavior of the subroutine as uncertain and only remove this uncertainty by a detailed analysis if needed, and (2) to abstract similar subroutines into a parametric template, and then analyse this template.
no code implementations • 3 Feb 2022 • Miguel Suau, Jinke He, Matthijs T. J. Spaan, Frans A. Oliehoek
Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL).
no code implementations • 21 Sep 2020 • Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek, Matthijs T. J. Spaan
Furthermore, we show that, under certain conditions, including submodularity, the value function computed using greedy PBVI is guaranteed to have bounded error with respect to the optimal value function.
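For context, the classic guarantee for greedily maximizing a monotone submodular set function under a cardinality constraint (Nemhauser, Wolsey, and Fisher) takes the form below; the paper's bound for greedy PBVI is in this spirit, though its exact statement and constants depend on the POMDP setting:

```latex
f(S_{\text{greedy}}) \;\ge\; \left(1 - \frac{1}{e}\right) \max_{|S| \le k} f(S)
```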
no code implementations • 29 Nov 2015 • Joris Scharpff, Diederik M. Roijers, Frans A. Oliehoek, Matthijs T. J. Spaan, Mathijs M. de Weerdt
In cooperative multi-agent sequential decision making under uncertainty, agents must coordinate to find an optimal joint policy that maximises joint value.
no code implementations • 18 Feb 2015 • Frans A. Oliehoek, Matthijs T. J. Spaan, Stefan Witwicki
Recent years have seen the development of methods for multiagent planning under uncertainty that scale to tens or even hundreds of agents.
no code implementations • 4 Feb 2014 • Frans Adriaan Oliehoek, Matthijs T. J. Spaan, Christopher Amato, Shimon Whiteson
We provide theoretical guarantees that, when a suitable heuristic is used, both incremental clustering and incremental expansion yield algorithms that are both complete and search equivalent.
no code implementations • 1 Aug 2011 • Frans A. Oliehoek, Shimon Whiteson, Matthijs T. J. Spaan
Such problems can be modeled as collaborative Bayesian games in which each agent receives private information in the form of its type.