Search Results for author: Vincent Thomas

Found 8 papers, 0 papers with code

How to Exhibit More Predictable Behaviors

no code implementations17 Apr 2024 Salomé Lepers, Sophie Lemonnier, Vincent Thomas, Olivier Buffet

This paper looks at predictability problems, i.e., problems in which an agent must choose its strategy so as to optimize the predictions that an external observer could make.

Monte-Carlo Search for an Equilibrium in Dec-POMDPs

no code implementations19 May 2023 Yang You, Vincent Thomas, Francis Colas, Olivier Buffet

Decentralized partially observable Markov decision processes (Dec-POMDPs) formalize the problem of designing individual controllers for a group of collaborative agents under stochastic dynamics and partial observability.
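
As a reminder of the formalism behind this and the other Dec-POMDP papers below (standard notation, not taken from this abstract), a Dec-POMDP for a set of agents $\mathcal{I}$ is usually written as a tuple:

$$\mathcal{M} = \langle \mathcal{I}, \mathcal{S}, \{\mathcal{A}_i\}_{i\in\mathcal{I}}, T, R, \{\Omega_i\}_{i\in\mathcal{I}}, O, b_0 \rangle,$$

where $T(s' \mid s, \vec{a})$ is the joint transition function, $R(s, \vec{a})$ a reward shared by all agents, $O(\vec{o} \mid s', \vec{a})$ the joint observation function, and $b_0$ the initial state distribution; each agent $i$ acts on the basis of its own observation history over $\Omega_i$ only.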

Robust Robot Planning for Human-Robot Collaboration

no code implementations27 Feb 2023 Yang You, Vincent Thomas, Francis Colas, Rachid Alami, Olivier Buffet

Based on this, we propose two contributions: 1) an approach to automatically generate an uncertain human behavior (a policy) for each given objective function while accounting for possible robot behaviors; and 2) a robot planning algorithm that is robust to the above-mentioned uncertainties and relies on solving a partially observable Markov decision process (POMDP) obtained by reasoning on a distribution over human behaviors.
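
A minimal sketch of the second contribution as I read it from this abstract (class and method names are mine, not the authors' code): treat the unknown human behavior as a latent variable of the POMDP state, drawn once from a distribution over candidate policies, so that a standard POMDP solver is automatically robust in expectation over that distribution.

```python
import random

# Hypothetical illustration only: fold a distribution over human policies into
# the POMDP state as a latent "human type". The base_env interface is assumed.

class RobustPOMDP:
    def __init__(self, base_env, human_policies, type_probs):
        self.base_env = base_env              # robot's model of the environment
        self.human_policies = human_policies  # one candidate policy per human type
        self.type_probs = type_probs          # prior over human types

    def sample_initial_state(self):
        # The latent human type is hidden from the robot and never changes.
        h = random.choices(range(len(self.human_policies)), weights=self.type_probs)[0]
        return (self.base_env.sample_initial_state(), h)

    def step(self, state, robot_action):
        env_state, h = state
        human_action = self.human_policies[h](env_state)  # human acts per its type
        next_env_state, obs, reward = self.base_env.step(
            env_state, robot_action, human_action)
        return (next_env_state, h), obs, reward            # type stays latent
```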

Solving infinite-horizon Dec-POMDPs using Finite State Controllers within JESP

no code implementations17 Sep 2021 Yang You, Vincent Thomas, Francis Colas, Olivier Buffet

This paper looks at solving collaborative planning problems formalized as Decentralized POMDPs (Dec-POMDPs) by searching for Nash equilibria, i.e., situations where each agent's policy is a best response to the other agents' (fixed) policies.
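
The Nash-equilibrium view suggests an iterated best-response loop of the kind JESP-style methods use; here is a generic sketch of that loop (the single-agent best_response solver is assumed, and this is not the paper's actual algorithm):

```python
def iterated_best_response(policies, best_response, max_iters=100):
    """Generic JESP-style loop: repeatedly improve one agent's policy while the
    other agents' policies stay fixed, until no agent can improve."""
    for _ in range(max_iters):
        improved = False
        for i in range(len(policies)):
            others = policies[:i] + policies[i + 1:]
            new_policy, gain = best_response(i, others)  # assumed single-agent solver
            if gain > 1e-9:
                policies[i] = new_policy
                improved = True
        if not improved:  # every policy is a best response: a Nash equilibrium
            break
    return policies
```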

Monte Carlo Information-Oriented Planning

no code implementations21 Mar 2021 Vincent Thomas, Gérémy Hutin, Olivier Buffet

In this article, we discuss how to solve information-gathering problems expressed as ρ-POMDPs, an extension of Partially Observable Markov Decision Processes (POMDPs) whose reward ρ depends on the belief state.
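
A classic instance of such a belief-dependent reward, often used as a running example in the ρ-POMDP literature (my choice of example, not necessarily the one used in this paper), is the negative entropy of the belief, which rewards the agent for being certain about the hidden state:

```python
import math

def rho_neg_entropy(belief):
    """Belief-dependent reward rho(b) = -H(b): larger when the belief is
    concentrated on few states, i.e. when the agent has gathered information."""
    return sum(p * math.log(p) for p in belief if p > 0.0)
```

With a reward of this form, the value of an action depends on how much it is expected to sharpen the belief, which a standard state-based POMDP reward cannot express.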

On Bellman's Optimality Principle for zs-POSGs

no code implementations29 Jun 2020 Olivier Buffet, Jilles Dibangoye, Aurélien Delage, Abdallah Saffidine, Vincent Thomas

Many non-trivial sequential decision-making problems are efficiently solved by relying on Bellman's optimality principle, i.e., exploiting the fact that sub-problems are nested recursively within the original problem.

Decision Making
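
For reference, the principle in question is the one behind the standard Bellman optimality equation, written here for a discounted MDP (the paper asks to what extent an analogue holds for zero-sum partially observable stochastic games):

$$V^*(s) = \max_{a} \left[ r(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^*(s') \right].$$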

ρ-POMDPs have Lipschitz-Continuous ε-Optimal Value Functions

no code implementations NeurIPS 2018 Mathieu Fehr, Olivier Buffet, Vincent Thomas, Jilles Dibangoye

In this paper, we focus on POMDPs and ρ-POMDPs with a λ_ρ-Lipschitz reward function, and demonstrate that, for finite horizons, the optimal value function is Lipschitz-continuous.
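
Lipschitz continuity of the value function over the belief simplex means a bound of the following form (the constant λ_v depends on λ_ρ and the horizon; the exact expression is given in the paper and not reproduced here):

$$\lvert V^*_t(b) - V^*_t(b') \rvert \;\le\; \lambda_v \, \lVert b - b' \rVert_1 \quad \text{for all beliefs } b, b'.$$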

A POMDP Extension with Belief-dependent Rewards

no code implementations NeurIPS 2010 Mauricio Araya, Olivier Buffet, Vincent Thomas, François Charpillet

Partially Observable Markov Decision Processes (POMDPs) model sequential decision-making problems under uncertainty and partial observability.

Decision Making
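
Under partial observability the agent plans over beliefs; the standard Bayes update that underlies all of the POMDP and ρ-POMDP work listed above is (standard notation, not specific to this paper):

$$b_{a,o}(s') = \frac{O(o \mid s', a) \sum_{s} T(s' \mid s, a)\, b(s)}{\Pr(o \mid b, a)}.$$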
