ICLR 2021 • Aayam Shrestha, Stefan Lee, Prasad Tadepalli, Alan Fern
We study an approach to offline reinforcement learning (RL) based on optimally solving finitely-represented MDPs derived from a static dataset of experience.
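
To make the idea concrete, here is a minimal sketch of the general pattern: estimate a finite MDP from logged transitions, then solve it with value iteration. It assumes small, discrete, hashable states and actions and a dataset of `(state, action, reward, next_state)` tuples. The function names `build_empirical_mdp` and `value_iteration`, the discount `gamma`, and the treatment of states with no logged actions as terminal are all illustrative assumptions, not the construction used in the paper.

```python
from collections import defaultdict


def build_empirical_mdp(dataset):
    """Estimate mean rewards and transition probabilities from (s, a, r, s') tuples."""
    counts = defaultdict(lambda: defaultdict(int))
    reward_sum = defaultdict(float)
    for s, a, r, s_next in dataset:
        counts[(s, a)][s_next] += 1
        reward_sum[(s, a)] += r
    model = {}
    for sa, next_counts in counts.items():
        n = sum(next_counts.values())
        probs = {s2: c / n for s2, c in next_counts.items()}
        model[sa] = (reward_sum[sa] / n, probs)
    return model


def value_iteration(model, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Solve the estimated MDP; return state values and a greedy policy."""
    states = {s for s, _ in model} | {s2 for _, p in model.values() for s2 in p}
    values = {s: 0.0 for s in states}

    def q_value(r, probs):
        return r + gamma * sum(p * values[s2] for s2, p in probs.items())

    for _ in range(max_iters):
        delta = 0.0
        for s in states:
            qs = [q_value(r, probs) for (s0, _), (r, probs) in model.items() if s0 == s]
            # Assumption: states with no logged actions are treated as terminal (value 0).
            new_v = max(qs) if qs else 0.0
            delta = max(delta, abs(new_v - values[s]))
            values[s] = new_v
        if delta < tol:
            break

    # Greedy policy restricted to actions that actually appear in the dataset.
    best = {}
    for (s, a), (r, probs) in model.items():
        q = q_value(r, probs)
        if s not in best or q > best[s][1]:
            best[s] = (a, q)
    return values, {s: a for s, (a, _) in best.items()}


if __name__ == "__main__":
    logged = [(0, "left", 0.0, 0), (0, "right", 1.0, 1), (1, "stay", 0.0, 1)]
    values, policy = value_iteration(build_empirical_mdp(logged))
    print(values, policy)
```

Restricting the policy to state-action pairs seen in the data is one simple way to keep the solution from relying on transitions the dataset never observed.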