no code implementations • 5 Dec 2019 • Maor Gaon, Ronen I. Brafman
Past work considered the problem of modeling and solving MDPs with non-Markovian rewards (NMR), but we know of no principled approaches for RL with NMR.
Q-Learning reinforcement-learning +1