NeurIPS 2021 • James Bell, Linda Linsefors, Caspar Oesterheld, Joar Skalse
This gives us a powerful tool for reasoning about the limit behaviour of agents. For example, it lets us show that there are Newcomblike environments in which a reinforcement learning agent cannot converge to any optimal policy.
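To give some intuition for how an environment can rule out convergence to an optimal policy, the sketch below works through a toy two-action problem. It is an illustrative assumption and not the construction used by the authors: the environment is assumed to draw a "prediction" from the agent's own policy and to pay reward 1 only when the action differs from that prediction. The names `expected_reward`, `action_values`, `epsilon_greedy_response` and `simulate` are hypothetical, and the update is an idealised one in which the agent knows its action values exactly, rather than a full learning algorithm.

```python
def expected_reward(p):
    """Expected reward when action 0 is played with probability p.

    Assumed toy environment: the prediction is an independent sample from
    the same policy, and the reward is 1 iff action != prediction.
    """
    return 2 * p * (1 - p)


def action_values(p):
    """Expected reward of each action, given that the policy is p.

    Action 0 pays off when the prediction is 1 (probability 1 - p);
    action 1 pays off when the prediction is 0 (probability p).
    """
    return (1 - p, p)


def epsilon_greedy_response(p, epsilon):
    """Probability of action 0 under an epsilon-greedy policy that is
    greedy with respect to the exact action values induced by p."""
    q0, q1 = action_values(p)
    if q0 > q1:
        return 1 - epsilon / 2
    if q1 > q0:
        return epsilon / 2
    return 0.5


def simulate(p, epsilon, steps):
    """Repeatedly replace the policy with its epsilon-greedy response."""
    trajectory = []
    for _ in range(steps):
        p = epsilon_greedy_response(p, epsilon)
        trajectory.append(p)
    return trajectory


if __name__ == "__main__":
    for p in simulate(p=0.9, epsilon=0.1, steps=6):
        print(f"P(action 0) = {p:.2f}, expected reward = {expected_reward(p):.3f}")
    print(f"uniform policy: expected reward = {expected_reward(0.5):.3f}")
```

In this toy setting the expected reward 2p(1 - p) is maximised only by the uniform policy p = 0.5, but whichever action currently looks better becomes worse as soon as the agent favours it. Under the idealised update the policy therefore keeps flipping between the two near-deterministic epsilon-greedy policies and never settles on the optimum.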