10 Apr 2018 • Alex Kearney, Vivek Veeriah, Jaden B. Travnik, Richard S. Sutton, Patrick M. Pilarski
In this paper, we introduce a method for adapting the step-sizes of temporal difference (TD) learning.
16 Feb 2018 • Jaden B. Travnik, Kory W. Mathewson, Richard S. Sutton, Patrick M. Pilarski
The relationship between a reinforcement learning (RL) agent and an asynchronous environment is often ignored.