no code implementations • 17 Apr 2024 • Salomé Lepers, Sophie Lemonnier, Vincent Thomas, Olivier Buffet
This paper looks at predictability problems, i.e., problems wherein an agent must choose its strategy so as to optimize the predictions that an external observer could make.
no code implementations • 19 May 2023 • Yang You, Vincent Thomas, Francis Colas, Olivier Buffet
Decentralized partially observable Markov decision processes (Dec-POMDPs) formalize the problem of designing individual controllers for a group of collaborative agents under stochastic dynamics and partial observability.
no code implementations • 27 Feb 2023 • Yang You, Vincent Thomas, Francis Colas, Rachid Alami, Olivier Buffet
Based on this, we propose two contributions: 1) an approach to automatically generate an uncertain human behavior (a policy) for each given objective function while accounting for possible robot behaviors; and 2) a robot planning algorithm that is robust to the above-mentioned uncertainties and relies on solving a partially observable Markov decision process (POMDP) obtained by reasoning on a distribution over human behaviors.
no code implementations • 17 Sep 2021 • Yang You, Vincent Thomas, Francis Colas, Olivier Buffet
This paper looks at solving collaborative planning problems formalized as Decentralized POMDPs (Dec-POMDPs) by searching for Nash equilibria, i.e., situations where each agent's policy is a best response to the other agents' (fixed) policies.
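The Nash-equilibrium criterion above (each policy is a best response to the others' fixed policies) can be illustrated, far more simply than in the Dec-POMDP setting, by iterated best response on a tiny two-player coordination game; the payoff matrix and function names here are hypothetical and not taken from the paper.

```python
# Hypothetical sketch: iterated best response on a 2x2 collaborative game,
# where both agents receive the same payoff A[a1][a2].
A = [[2, 0],
     [0, 1]]

def best_response(payoff, other_action):
    # Best pure action against the other agent's fixed action.
    return max(range(len(payoff)), key=lambda a: payoff[a][other_action])

def find_nash(payoff, start=(0, 0), max_iters=20):
    a1, a2 = start
    for _ in range(max_iters):
        new_a1 = best_response(payoff, a2)
        # Transpose the matrix so agent 2 can reuse best_response.
        new_a2 = best_response([list(r) for r in zip(*payoff)], new_a1)
        if (new_a1, new_a2) == (a1, a2):
            return a1, a2  # mutual best responses: a Nash equilibrium
        a1, a2 = new_a1, new_a2
    return a1, a2

print(find_nash(A))  # → (0, 0)
```

Note that best-response dynamics converge to whichever equilibrium is reached from the starting point (here, starting from (1, 1) would yield the other, lower-payoff equilibrium), which is why one generally has to search among equilibria rather than settle for the first one found.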
no code implementations • 21 Mar 2021 • Vincent Thomas, Gérémy Hutin, Olivier Buffet
In this article, we discuss how to solve information-gathering problems expressed as ρ-POMDPs, an extension of Partially Observable Markov Decision Processes (POMDPs) whose reward ρ depends on the belief state.
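A common choice of belief-dependent reward in information gathering is (negative) entropy, which rewards beliefs concentrated on a single state. The sketch below uses that choice purely for illustration; the framework admits any belief-dependent ρ.

```python
import math

def rho(belief):
    # Negative Shannon entropy of the belief (a list of probabilities):
    # maximal (0) when the agent is certain of the state, and lower the
    # more spread-out the belief is.
    return sum(p * math.log(p) for p in belief if p > 0)
```

For instance, a certain belief `[1.0, 0.0]` yields reward 0, while the uniform belief `[0.5, 0.5]` yields `log(0.5) ≈ -0.69`, so maximizing cumulative ρ drives the agent toward actions that reduce its uncertainty.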
no code implementations • 29 Jun 2020 • Olivier Buffet, Jilles Dibangoye, Aurélien Delage, Abdallah Saffidine, Vincent Thomas
Many non-trivial sequential decision-making problems are efficiently solved by relying on Bellman's optimality principle, i.e., exploiting the fact that sub-problems are nested recursively within the original problem.
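Bellman's optimality principle as described above can be made concrete with value iteration on a toy 2-state MDP (the transition model and numbers below are hypothetical, not from the paper): the value of each state is defined recursively through the values of the sub-problems starting at its successor states.

```python
# P[state][action] = list of (probability, next_state, reward) outcomes.
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 1.0)]},
    1: {"stay": [(1.0, 1, 0.5)], "go": [(1.0, 0, 0.0)]},
}

def value_iteration(P, gamma=0.9, iters=200):
    # Repeatedly apply the Bellman optimality backup until (near) convergence.
    V = {s: 0.0 for s in P}
    for _ in range(iters):
        V = {s: max(sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                    for outcomes in P[s].values())
             for s in P}
    return V

V = value_iteration(P)  # V[0] ≈ 5.5, V[1] ≈ 5.0
```

Here the fixed point satisfies the recursion directly: V(1) = 0.5 + 0.9·V(1) gives V(1) = 5, and V(0) = 1 + 0.9·V(1) = 5.5, each state's value being expressed in terms of the nested sub-problem at its successor.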
no code implementations • NeurIPS 2018 • Mathieu Fehr, Olivier Buffet, Vincent Thomas, Jilles Dibangoye
In this paper, we focus on POMDPs and ρ-POMDPs with λ_ρ-Lipschitz reward functions, and demonstrate that, for finite horizons, the optimal value function is Lipschitz-continuous.
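The Lipschitz condition on the reward referred to above can be sketched as follows (the choice of the 1-norm over the belief simplex is an assumption made here for concreteness):

```latex
% ρ is λ_ρ-Lipschitz over the belief simplex:
\left| \rho(b) - \rho(b') \right|
  \le \lambda_\rho \, \lVert b - b' \rVert_1
  \quad \text{for all beliefs } b, b'.
```

Intuitively, bounding how fast ρ can vary with the belief is what allows this property to propagate through finitely many Bellman backups, yielding a Lipschitz-continuous optimal value function.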
no code implementations • NeurIPS 2010 • Mauricio Araya, Olivier Buffet, Vincent Thomas, François Charpillet
Partially Observable Markov Decision Processes (POMDPs) model sequential decision-making problems under uncertainty and partial observability.