15 Jan 2014 • Eyal Amir, Allen Chang
Our algorithms take sequences of partial observations over time as input and output deterministic action models that could have led to those observations.