1 code implementation • 21 Feb 2024 • Lucas Lehnert, Sainbayar Sukhbaatar, DiJia Su, Qinqing Zheng, Paul McVay, Michael Rabbat, Yuandong Tian
We fine-tune this model to obtain a Searchformer, a Transformer model that optimally solves previously unseen Sokoban puzzles 93.7% of the time, while using up to 26.8% fewer search steps than the $A^*$ implementation that was used for training initially.
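The baseline the snippet compares against is standard $A^*$ search, whose expansion count is the "search steps" metric. As a point of reference, a minimal generic $A^*$ sketch (the function names and the grid example below are illustrative, not the paper's implementation):

```python
import heapq

def a_star(start, goal, neighbors, heuristic):
    """Generic A* search. Returns (path, expansions); expansions is the
    number of nodes popped and expanded, i.e. the 'search steps' count."""
    frontier = [(heuristic(start), 0, start, [start])]
    best_g = {start: 0}
    expansions = 0
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, expansions
        expansions += 1
        for nxt, cost in neighbors(node):
            ng = g + cost
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(frontier, (ng + heuristic(nxt), ng, nxt, path + [nxt]))
    return None, expansions
```

With an admissible heuristic (e.g. Manhattan distance on a grid), the returned path is optimal; a Searchformer-style model aims to reach the same optimal solutions while expanding fewer nodes.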
no code implementations • 1 Jun 2023 • Rohan Chitnis, Yingchen Xu, Bobak Hashemi, Lucas Lehnert, Urun Dogan, Zheqing Zhu, Olivier Delalleau
Model-based reinforcement learning (RL) has shown great promise due to its sample efficiency, but still struggles with long-horizon sparse-reward tasks, especially in offline settings where the agent learns from a fixed dataset.
no code implementations • 7 Nov 2022 • Lucas Lehnert, Michael J. Frank, Michael L. Littman
Recent advances in reinforcement-learning research have demonstrated impressive results in building algorithms that can outperform humans in complex tasks.
no code implementations • 31 Jan 2019 • Lucas Lehnert, Michael L. Littman
A key question in reinforcement learning is how an intelligent agent can generalize knowledge across different inputs.
no code implementations • 3 Dec 2018 • Dilip Arumugam, David Abel, Kavosh Asadi, Nakul Gopalan, Christopher Grimm, Jun Ki Lee, Lucas Lehnert, Michael L. Littman
An agent with an inaccurate model of its environment faces a difficult choice: it can ignore the errors in its model and act in the real world in whatever way it determines is optimal with respect to its model.
no code implementations • 4 Jul 2018 • Lucas Lehnert, Michael L. Littman
Further, we present a Successor Feature model which shows that learning Successor Features is equivalent to learning a Model-Reduction.
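In the tabular case, the Successor Features of a fixed policy satisfy a Bellman-style fixed point $\Psi = \Phi + \gamma P_\pi \Psi$, which can be solved in closed form. A minimal sketch of that computation (the function name and toy matrices are assumptions for illustration, not the paper's code):

```python
import numpy as np

def successor_features(P_pi, Phi, gamma=0.9):
    """Solve Psi = Phi + gamma * P_pi @ Psi for a fixed policy pi.
    Row s of Psi is the expected discounted sum of feature vectors
    phi(s_t) when starting in state s and following pi."""
    n = P_pi.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * P_pi, Phi)
```

With one-hot features (`Phi = np.eye(n)`) this recovers the successor representation; for any reward that is linear in the features, $r = \Phi w$, the value function factors as $V^\pi = \Psi w$, which is what makes Successor Features useful for transfer across tasks.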
no code implementations • ICML 2018 • David Abel, Dilip Arumugam, Lucas Lehnert, Michael Littman
We introduce two new classes of abstractions: (1) transitive state abstractions, whose optimal form can be computed efficiently, and (2) PAC state abstractions, which are guaranteed to hold with respect to a distribution of tasks.
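A state abstraction of the kind this line refers to is a function that aggregates ground states judged equivalent under some predicate. A minimal sketch of one common instance, grouping states whose action values agree within a tolerance (the function name and toy Q-table are illustrative assumptions, not the paper's definitions):

```python
def q_abstraction(states, q, eps=0.0):
    """Cluster states whose action-value vectors match within eps.
    q maps state -> {action: value}; returns a list of clusters."""
    clusters = []
    for s in states:
        for cluster in clusters:
            rep = cluster[0]  # representative ground state
            if all(abs(q[s][a] - q[rep][a]) <= eps for a in q[s]):
                cluster.append(s)
                break
        else:
            clusters.append([s])
    return clusters
```

Exact aggregation (`eps=0`) is the strictest case; relaxing `eps` trades off abstraction size against value loss, the kind of trade-off the transitive and PAC abstraction classes formalize.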
no code implementations • 31 Jul 2017 • Lucas Lehnert, Stefanie Tellex, Michael L. Littman
One question central to Reinforcement Learning is how to learn a feature representation that supports algorithm scaling and re-use of learned information from different tasks.
no code implementations • 13 Dec 2015 • Lucas Lehnert, Doina Precup
Off-policy learning refers to the problem of learning the value function of a way of behaving, or policy, while following a different policy.
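A standard way to make this concrete is per-decision importance sampling: reweight each temporal-difference update by the ratio of target- and behavior-policy probabilities for the action actually taken. A minimal TD(0) sketch under that scheme (function name and episode format are assumptions for illustration; this is the generic textbook estimator, not the paper's specific algorithm):

```python
def off_policy_td0(episodes, pi, mu, alpha=0.1, gamma=0.9, n_states=2):
    """Per-decision importance-sampling TD(0): estimate V^pi from
    transitions (s, a, r, s_next) generated by behavior policy mu.
    pi(a, s) and mu(a, s) return action probabilities."""
    V = [0.0] * n_states
    for episode in episodes:
        for s, a, r, s_next in episode:
            rho = pi(a, s) / mu(a, s)       # importance ratio
            td = r + gamma * V[s_next] - V[s]
            V[s] += alpha * rho * td        # reweighted TD update
    return V
```

When `pi == mu` the ratio is 1 and this reduces to ordinary on-policy TD(0); the ratio is what corrects for the mismatch between the policy being evaluated and the policy generating the data.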