Search Results for author: Matthijs T. J. Spaan

Found 15 papers, 1 paper with code

Exploring LLMs as a Source of Targeted Synthetic Textual Data to Minimize High Confidence Misclassifications

no code implementations26 Mar 2024 Philip Lippmann, Matthijs T. J. Spaan, Jie Yang

Natural Language Processing (NLP) models optimized for predictive performance often make high confidence errors and suffer from vulnerability to adversarial and out-of-distribution data.

Data Augmentation

Reinforcement Learning by Guided Safe Exploration

no code implementations26 Jul 2023 Qisong Yang, Thiago D. Simão, Nils Jansen, Simon H. Tindemans, Matthijs T. J. Spaan

Drawing from transfer learning, we also regularize a target policy (the student) towards the guide while the student is unreliable and gradually eliminate the influence of the guide as training progresses.

Reinforcement Learning (RL) +2
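The guide-to-student regularization sketched in the excerpt above can be illustrated with a minimal, hypothetical example: a softmax student policy is pulled toward a fixed guide policy by a KL penalty whose weight decays over training. The beta schedule, the placeholder task objective, and the discrete action space are all assumptions for illustration; this is not the paper's algorithm.

```python
# Minimal sketch (not the paper's method): regularize a "student" policy toward a
# fixed "guide" policy and let the guide's influence fade as training progresses.
import numpy as np

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def kl(p, q, eps=1e-8):
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

rng = np.random.default_rng(0)
guide_probs = softmax(rng.normal(size=4))   # pretrained guide (safe) policy
student_logits = rng.normal(size=4)         # student policy being trained

for step in range(1, 1001):
    beta = max(0.0, 1.0 - step / 500)       # regularization weight decays to zero
    student_probs = softmax(student_logits)
    # Placeholder task objective: prefer action 0 (stands in for the RL loss).
    task_grad = student_probs - np.eye(4)[0]
    # Gradient of KL(student || guide) with respect to the student logits.
    kl_grad = student_probs * (np.log(student_probs / guide_probs) - kl(student_probs, guide_probs))
    student_logits -= 0.1 * (task_grad + beta * kl_grad)

print("final student policy:", np.round(softmax(student_logits), 3))
```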

Diverse Projection Ensembles for Distributional Reinforcement Learning

no code implementations12 Jun 2023 Moritz A. Zanger, Wendelin Böhmer, Matthijs T. J. Spaan

In contrast to classical reinforcement learning, distributional reinforcement learning algorithms aim to learn the distribution of returns rather than their expected value.

Distributional Reinforcement Learning, Inductive Bias +1
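To make the contrast drawn in the excerpt concrete, here is a small illustrative sketch of a categorical return distribution versus a scalar value estimate for a single state-action pair. The bimodal return distribution and the 51-atom support are made-up assumptions; the paper's projection ensembles are not reproduced here.

```python
# Illustrative sketch: classical RL learns E[Z(s, a)], distributional RL learns
# the return distribution Z(s, a) itself (here as a categorical histogram).
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical bimodal return distribution for one state-action pair.
returns = np.where(rng.random(10_000) < 0.7,
                   rng.normal(1.0, 0.2, 10_000),
                   rng.normal(8.0, 0.5, 10_000))

# Classical target: a single expected value.
value_estimate = returns.mean()

# Distributional target: probability mass over a fixed support of atoms.
atoms = np.linspace(0.0, 10.0, 51)
probs, _ = np.histogram(returns, bins=np.append(atoms, atoms[-1] + 0.2))
probs = probs / probs.sum()

print(f"expected return: {value_estimate:.2f}")
print(f"distributional mean: {np.dot(atoms, probs):.2f}")
print(f"mass below return 2.0: {probs[atoms < 2.0].sum():.2f}")  # risk info the mean hides
```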

The Role of Diverse Replay for Generalisation in Reinforcement Learning

no code implementations9 Jun 2023 Max Weltevrede, Matthijs T. J. Spaan, Wendelin Böhmer

We motivate mathematically and show empirically that generalisation to tasks that are "reachable" during training is improved by increasing the diversity of transitions in the replay buffer.

Reinforcement Learning (RL)
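A toy sketch of what "diversity of transitions in the replay buffer" can mean in practice: the buffer below is filled either from a single greedy trajectory or from exploratory rollouts, and the number of distinct states it contains is compared. The grid size, rollout scheme, and diversity measure are assumptions for illustration, not the paper's experimental setup.

```python
# Illustrative sketch only: compare the state diversity of a replay buffer filled
# by a repetitive greedy trajectory with one filled by exploratory rollouts.
import random

random.seed(0)
N_STATES, BUFFER_SIZE = 100, 500

def rollout(exploratory: bool):
    """Collect one sequence of transitions; exploratory rollouts jump around more."""
    s, transitions = 0, []
    for _ in range(50):
        s_next = random.randrange(N_STATES) if exploratory else min(s + 1, N_STATES - 1)
        transitions.append((s, s_next))
        s = s_next
    return transitions

def fill_buffer(exploratory: bool):
    buffer = []
    while len(buffer) < BUFFER_SIZE:
        buffer.extend(rollout(exploratory))
    return buffer[:BUFFER_SIZE]

for name, exploratory in [("greedy-only buffer", False), ("exploratory buffer", True)]:
    buffer = fill_buffer(exploratory)
    distinct = len({s for s, _ in buffer})
    print(f"{name}: {distinct} distinct states out of {N_STATES}")
```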

Bad Habits: Policy Confounding and Out-of-Trajectory Generalization in RL

no code implementations4 Jun 2023 Miguel Suau, Matthijs T. J. Spaan, Frans A. Oliehoek

In this paper, we provide a mathematical characterization of this phenomenon, which we refer to as policy confounding, and show, through a series of examples, when and how it occurs in practice.

E-MCTS: Deep Exploration in Model-Based Reinforcement Learning by Planning with Epistemic Uncertainty

no code implementations21 Oct 2022 Yaniv Oren, Matthijs T. J. Spaan, Wendelin Böhmer

One of the most well-studied and highest-performing planning approaches used in Model-Based Reinforcement Learning (MBRL) is Monte-Carlo Tree Search (MCTS).

Model-based Reinforcement Learning, Reinforcement Learning (RL) +1
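As a rough illustration of planning with epistemic uncertainty, the sketch below adds an epistemic bonus, taken from the disagreement of an assumed ensemble of learned models, to a UCB-style action-selection rule of the kind used inside tree search. This is not the E-MCTS algorithm; the ensemble, the one-step lookahead, and the bonus coefficients are hypothetical.

```python
# Minimal sketch: UCB-style selection augmented with an epistemic-uncertainty
# bonus derived from ensemble disagreement (one-step lookahead, not full MCTS).
import math
import numpy as np

rng = np.random.default_rng(2)

def ensemble_value(action: int) -> np.ndarray:
    """Value predictions for one action from a hypothetical 5-member model ensemble."""
    true_value = [0.5, 0.6, 0.4][action]
    return true_value + rng.normal(0.0, [0.05, 0.30, 0.05][action], size=5)

def select_action(visit_counts, total_visits, c_ucb=1.0, c_epistemic=1.0):
    scores = []
    for a, n in enumerate(visit_counts):
        preds = ensemble_value(a)
        exploit = preds.mean()
        epistemic_bonus = c_epistemic * preds.std()   # ensemble disagreement
        ucb_bonus = c_ucb * math.sqrt(math.log(total_visits + 1) / (n + 1))
        scores.append(exploit + epistemic_bonus + ucb_bonus)
    return int(np.argmax(scores))

# Action 1 is rarely visited and its model predictions disagree, so the
# epistemic bonus steers the search toward exploring it deeply.
print("selected action:", select_action(visit_counts=[10, 2, 10], total_visits=22))
```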

Distributed Influence-Augmented Local Simulators for Parallel MARL in Large Networked Systems

1 code implementation1 Jul 2022 Miguel Suau, Jinke He, Mustafa Mert Çelikok, Matthijs T. J. Spaan, Frans A. Oliehoek

Due to the high sample complexity of reinforcement learning, simulation is, as of today, critical for its successful application.

Abstraction-Refinement for Hierarchical Probabilistic Models

no code implementations6 Jun 2022 Sebastian Junges, Matthijs T. J. Spaan

The key ideas to accelerate analysis of such programs are (1) to treat the behavior of the subroutine as uncertain and only remove this uncertainty by a detailed analysis if needed, and (2) to abstract similar subroutines into a parametric template, and then analyse this template.
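Idea (1) from the excerpt can be sketched with a small example: the subroutine's behaviour is first summarised by an interval of possible success probabilities, and a detailed (expensive) analysis is triggered only when the resulting bounds on the top-level query are inconclusive. The top-level formula, the interval, and the threshold query are invented for illustration and do not reproduce the paper's procedure.

```python
# Illustrative sketch: treat a subroutine's success probability as an interval
# and only "refine" (analyse the subroutine in detail) if the bounds are too
# loose to answer a threshold query about the top-level model.

def top_level_bounds(sub_lo: float, sub_hi: float):
    """Hypothetical top-level model: success = 0.9 * subroutine success + 0.05."""
    return 0.9 * sub_lo + 0.05, 0.9 * sub_hi + 0.05

def check(threshold: float, sub_interval, detailed_analysis):
    lo, hi = top_level_bounds(*sub_interval)
    if lo >= threshold:
        return "holds (abstraction sufficient)"
    if hi < threshold:
        return "violated (abstraction sufficient)"
    # Bounds are inconclusive: pay for the detailed subroutine analysis.
    p = detailed_analysis()
    lo, hi = top_level_bounds(p, p)
    return "holds (after refinement)" if lo >= threshold else "violated (after refinement)"

coarse_interval = (0.6, 0.95)                 # cheap abstraction of the subroutine
detailed = lambda: 0.8                        # expensive exact analysis, run on demand
print(check(0.5, coarse_interval, detailed))  # decided by the abstraction alone
print(check(0.75, coarse_interval, detailed)) # needs refinement
```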

Influence-Augmented Local Simulators: A Scalable Solution for Fast Deep RL in Large Networked Systems

no code implementations3 Feb 2022 Miguel Suau, Jinke He, Matthijs T. J. Spaan, Frans A. Oliehoek

Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL).

Reinforcement Learning (RL)

Exploiting Submodular Value Functions For Scaling Up Active Perception

no code implementations21 Sep 2020 Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek, Matthijs T. J. Spaan

Furthermore, we show that, under certain conditions, including submodularity, the value function computed using greedy PBVI is guaranteed to have bounded error with respect to the optimal value function.
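The bounded-error guarantee mentioned above rests on the classical behaviour of greedy maximization of monotone submodular functions. The sketch below shows only that underlying principle on a made-up sensor-selection problem (greedy PBVI itself is not reproduced): coverage gains shrink as more sensors are added, so greedy selection enjoys the well-known (1 - 1/e) approximation bound.

```python
# Minimal sketch of the greedy-submodular principle (not greedy PBVI itself):
# greedily pick sensors to maximize a monotone submodular coverage function.
# Sensor names and coverage sets are hypothetical.
coverage = {
    "cam_A": {1, 2, 3, 4},
    "cam_B": {3, 4, 5},
    "cam_C": {6, 7},
    "cam_D": {1, 6, 7, 8},
}

def greedy_select(budget: int):
    chosen, covered = [], set()
    for _ in range(budget):
        # Pick the unchosen sensor with the largest marginal gain in coverage.
        best = max((s for s in coverage if s not in chosen),
                   key=lambda s: len(coverage[s] - covered))
        chosen.append(best)
        covered |= coverage[best]
    return chosen, covered

chosen, covered = greedy_select(budget=2)
print("chosen sensors:", chosen)       # ['cam_A', 'cam_D']
print("cells covered:", len(covered))  # within (1 - 1/e) of the optimal coverage
```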

Solving Transition-Independent Multi-agent MDPs with Sparse Interactions (Extended version)

no code implementations29 Nov 2015 Joris Scharpff, Diederik M. Roijers, Frans A. Oliehoek, Matthijs T. J. Spaan, Mathijs M. de Weerdt

In cooperative multi-agent sequential decision making under uncertainty, agents must coordinate to find an optimal joint policy that maximises joint value.

Decision Making, Decision Making Under Uncertainty

Influence-Optimistic Local Values for Multiagent Planning --- Extended Version

no code implementations18 Feb 2015 Frans A. Oliehoek, Matthijs T. J. Spaan, Stefan Witwicki

Recent years have seen the development of methods for multiagent planning under uncertainty that scale to tens or even hundreds of agents.

Benchmarking

Incremental Clustering and Expansion for Faster Optimal Planning in Dec-POMDPs

no code implementations4 Feb 2014 Frans Adriaan Oliehoek, Matthijs T. J. Spaan, Christopher Amato, Shimon Whiteson

We provide theoretical guarantees that, when a suitable heuristic is used, both incremental clustering and incremental expansion yield algorithms that are both complete and search equivalent.

Clustering

Exploiting Agent and Type Independence in Collaborative Graphical Bayesian Games

no code implementations1 Aug 2011 Frans A. Oliehoek, Shimon Whiteson, Matthijs T. J. Spaan

Such problems can be modeled as collaborative Bayesian games in which each agent receives private information in the form of its type.

Decision Making, Vocal Bursts Type Prediction
