2 code implementations • 19 Feb 2024 • Anya Sims, Cong Lu, Yee Whye Teh
The prevailing theoretical understanding is that this can then be viewed as online reinforcement learning in an approximate dynamics model, and any remaining performance gap is therefore attributed to errors in the dynamics model.