no code implementations • 26 Feb 2024 • Carlos G. Correa, Thomas L. Griffiths, Nathaniel D. Daw
Typical models of learning assume incremental estimation of continuously-varying decision variables like expected rewards.
2 code implementations • 30 Nov 2023 • Carlos G. Correa, Sophia Sanborn, Mark K. Ho, Frederick Callaway, Nathaniel D. Daw, Thomas L. Griffiths
We find that humans are sensitive to both metrics, but that both accounts fail to predict a qualitative feature of human-created programs, namely that people prefer programs with reuse over and above the predictions of MDL.
no code implementations • 7 Nov 2022 • Carlos G. Correa, Mark K. Ho, Frederick Callaway, Nathaniel D. Daw, Thomas L. Griffiths
Human behavior emerges from planning over elaborate decompositions of tasks into goals, subgoals, and low-level actions.
1 code implementation • 23 May 2022 • Sreejan Kumar, Carlos G. Correa, Ishita Dasgupta, Raja Marjieh, Michael Y. Hu, Robert D. Hawkins, Nathaniel D. Daw, Jonathan D. Cohen, Karthik Narasimhan, Thomas L. Griffiths
Co-training on these representations result in more human-like behavior in downstream meta-reinforcement learning agents than less abstract controls (synthetic language descriptions, program induction without learned primitives), suggesting that the abstraction supported by these representations is key.
1 code implementation • 4 Apr 2022 • Sreejan Kumar, Ishita Dasgupta, Nathaniel D. Daw, Jonathan D. Cohen, Thomas L. Griffiths
However, because neural networks are hard to interpret, it can be difficult to tell whether agents have learned the underlying abstraction, or alternatively statistical patterns that are characteristic of that abstraction.
1 code implementation • ICLR 2021 • Sreejan Kumar, Ishita Dasgupta, Jonathan D. Cohen, Nathaniel D. Daw, Thomas L. Griffiths
We then introduce a novel approach to constructing a "null task distribution" with the same statistical complexity as this structured task distribution but without the explicit rule-based structure used to generate the structured task.
no code implementations • NeurIPS 2011 • Dylan A. Simon, Nathaniel D. Daw
There is much evidence that humans and other animals utilize a combination of model-based and model-free RL methods.