3 code implementations • ICLR 2021 • Simon Ramstedt, Yann Bouteiller, Giovanni Beltrame, Christopher Pal, Jonathan Binas
Action and observation delays are common in Reinforcement Learning applications, such as remote control scenarios.
3 code implementations • NeurIPS 2019 • Simon Ramstedt, Christopher Pal
Markov Decision Processes (MDPs), the mathematical framework underlying most algorithms in Reinforcement Learning (RL), are often used in a way that wrongly assumes that the state of an agent's environment does not change during action selection.