1 code implementation • 13 Oct 2021 • Samuel Allen Alexander, Michael Castaneda, Kevin Compher, Oscar Martinez
We consider an extended notion of reinforcement learning in which the environment can simulate the agent and base its outputs on the agent's hypothetical behavior.
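As a rough illustration of the idea, here is a minimal sketch of an environment that takes the agent itself as an input and queries it on a counterfactual history before deciding the reward. Everything here is an assumption made for illustration and is not taken from the paper: the names `Percept`, `Agent`, `extended_env_step`, and `run`, the deterministic agent interface, and the specific rule of rewarding an action only if the agent would have chosen it had all past rewards been zero.

```python
from typing import Callable, List, Tuple

# Hypothetical types for this sketch (not from the paper).
Percept = Tuple[float, int]              # (reward, observation)
Agent = Callable[[List[Percept]], int]   # maps a percept history to an action


def extended_env_step(agent: Agent, history: List[Percept], action: int) -> Percept:
    """Return the next percept, using a simulation of the agent to pick the reward."""
    # Build a counterfactual history in which every past reward was zero.
    hypothetical = [(0.0, obs) for (_, obs) in history]
    # Simulate the agent: what would it do on the counterfactual history?
    predicted = agent(hypothetical)
    # Assumed rule: reward agreement with the hypothetical behavior.
    reward = 1.0 if action == predicted else -1.0
    return (reward, 0)  # the observation is always 0 in this toy example


def run(agent: Agent, steps: int) -> float:
    """Run the agent in the toy environment and return the total reward."""
    history: List[Percept] = [(0.0, 0)]  # initial percept
    total = 0.0
    for _ in range(steps):
        action = agent(history)
        percept = extended_env_step(agent, history, action)
        total += percept[0]
        history.append(percept)
    return total


def constant_agent(history: List[Percept]) -> int:
    # Ignores its history entirely, so it always matches its simulated self.
    return 0


def reward_sensitive_agent(history: List[Percept]) -> int:
    # Changes its action once the accumulated reward is positive.
    return 1 if sum(r for r, _ in history) > 0 else 0
```

In this sketch an agent whose actions do not depend on past rewards is never penalized, while one that reacts to rewards is penalized whenever its real and hypothetical behavior diverge. An ordinary environment, which only sees the actions actually taken, could not express such a rule.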