ICLR Workshop Learning to Learn 2021 • Michael Chang, Sidhant Kaushik, Sergey Levine, Thomas L. Griffiths
Empirical evidence suggests that such action-value methods are more sample efficient than policy-gradient methods on transfer problems that require only sparse changes to a sequence of previously optimal decisions.
5 Jul 2020 • Michael Chang, Sidhant Kaushik, S. Matthew Weinberg, Thomas L. Griffiths, Sergey Levine
This paper seeks to establish a framework for directing a society of simple, specialized, self-interested agents to solve what traditionally are posed as monolithic single-agent sequential decision problems.