1 code implementation • 17 Sep 2021 • Zac Wellmer, James T. Kwok
Training the World Model with dropout allows it to generate a nearly infinite number of different dream environments.
no code implementations • 25 Sep 2019 • Zac Wellmer, Sepanta Zeighami, James Kwok
However, decision-time planning with implicit dynamics models in continuous action space has proven to be a difficult problem.
Model-based Reinforcement Learning • Policy Gradient Methods +3
no code implementations • 15 Sep 2019 • Zac Wellmer, James Kwok
This paper proposes a novel deep reinforcement learning architecture inspired by previous tree-structured architectures, which were only usable in discrete action spaces.