no code implementations • 27 Jan 2023 • Lingwei Zhu, Zheng Chen, Matthew Schlegel, Martha White
Many policy optimization approaches in reinforcement learning incorporate a Kullback-Leibler (KL) divergence to the previous policy, to prevent the policy from changing too quickly.
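A minimal sketch of the generic idea (not the paper's specific algorithm): for a discrete action space, the objective trades off expected advantage against a KL penalty toward the previous policy. The function and variable names are illustrative assumptions.

```python
import numpy as np

def kl(p, q):
    """KL(p || q) for discrete distributions with positive entries."""
    return np.sum(p * np.log(p / q))

def regularized_objective(pi_new, pi_old, advantages, beta=0.1):
    # Expected advantage under the new policy, minus a KL penalty that
    # discourages moving too far from the previous policy.
    return np.dot(pi_new, advantages) - beta * kl(pi_new, pi_old)

pi_old = np.array([0.25, 0.25, 0.5])
pi_new = np.array([0.2, 0.2, 0.6])
advantages = np.array([0.0, 1.0, 2.0])
print(regularized_objective(pi_new, pi_old, advantages))
```

Larger `beta` keeps the updated policy closer to the previous one; `beta -> 0` recovers the unregularized objective.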
no code implementations • NeurIPS 2021 • Matthew McLeod, Chunlok Lo, Matthew Schlegel, Andrew Jacobsen, Raksha Kumaraswamy, Martha White, Adam White
Learning auxiliary tasks, such as multiple predictions about the world, can provide many benefits to reinforcement learning systems.
no code implementations • NeurIPS 2021 • Dhawal Gupta, Gabor Mihucz, Matthew Schlegel, James Kostas, Philip S. Thomas, Martha White
In this work, we revisit this approach and investigate whether we can leverage other reinforcement learning approaches to improve learning.
no code implementations • 17 Jul 2019 • Andrew Jacobsen, Matthew Schlegel, Cameron Linke, Thomas Degris, Adam White, Martha White
This paper investigates different vector step-size adaptation approaches for non-stationary, online, continual prediction problems.
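One classic vector step-size method is IDBD (Sutton, 1992), which maintains a separate adaptive step-size per weight. A minimal sketch for linear supervised prediction is below; it is illustrative and not necessarily one of the specific methods compared in the paper.

```python
import numpy as np

def idbd_update(w, beta, h, x, target, theta=0.01):
    """One IDBD step: per-weight step-sizes alpha_i = exp(beta_i)."""
    delta = target - w @ x                 # prediction error
    beta += theta * delta * x * h          # meta-gradient step on log step-sizes
    alpha = np.exp(beta)                   # per-weight step-sizes
    w += alpha * delta * x                 # weight update
    h = h * np.clip(1.0 - alpha * x * x, 0.0, None) + alpha * delta * x
    return w, beta, h
```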
2 code implementations • NeurIPS 2019 • Matthew Schlegel, Wesley Chung, Daniel Graves, Jian Qian, Martha White
Importance sampling (IS) is a common reweighting strategy for off-policy prediction in reinforcement learning.
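As background, a standard per-step IS correction reweights a TD(0) update by the ratio of target to behavior action probabilities. A minimal sketch with linear value estimates, using illustrative names:

```python
import numpy as np

def off_policy_td0(w, x_t, x_tp1, r, pi_a, b_a, alpha=0.1, gamma=0.99):
    """TD(0) with an importance sampling ratio rho = pi(a|s) / b(a|s)."""
    rho = pi_a / b_a
    delta = r + gamma * (w @ x_tp1) - (w @ x_t)
    return w + alpha * rho * delta * x_t   # reweighted update
```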
no code implementations • NeurIPS 2018 • Raksha Kumaraswamy, Matthew Schlegel, Adam White, Martha White
Directed exploration strategies for reinforcement learning are critical for learning an optimal policy in a minimal number of interactions with the environment.
no code implementations • 27 Sep 2018 • Matthew Schlegel, Wesley Chung, Daniel Graves, Martha White
We propose Importance Resampling (IR) for off-policy learning, which resamples experience from the replay buffer and applies a standard on-policy update.
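A minimal sketch of the resampling idea: draw transitions from the replay buffer with probability proportional to their importance sampling ratios, then update on the sampled minibatch as if on-policy. Details such as bias correction and buffer structure are assumptions here, not the paper's exact algorithm.

```python
import numpy as np

def sample_by_ratio(buffer, ratios, batch_size, rng=None):
    """Resample buffer indices proportional to IS ratios."""
    rng = rng or np.random.default_rng()
    probs = np.asarray(ratios) / np.sum(ratios)
    idx = rng.choice(len(buffer), size=batch_size, p=probs)
    return [buffer[i] for i in idx]  # apply a standard on-policy update to these
```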
no code implementations • 18 Jul 2018 • Matthew Schlegel, Andrew Jacobsen, Zaheer Abbas, Andrew Patterson, Adam White, Martha White
A general purpose strategy for state construction is to learn the state update using a Recurrent Neural Network (RNN), which updates the internal state using the current internal state and the most recent observation.
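Concretely, the state update has the form s_t = f(s_{t-1}, o_t). A minimal illustrative sketch with a vanilla tanh RNN cell standing in for f:

```python
import numpy as np

def rnn_state_update(s_prev, obs, W_s, W_o, b):
    # s_t = f(s_{t-1}, o_t); here f is a simple recurrent cell.
    return np.tanh(W_s @ s_prev + W_o @ obs + b)
```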
no code implementations • ICLR 2018 • Matthew Schlegel, Andrew Patterson, Adam White, Martha White
We investigate a framework for discovery: curating a large collection of predictions, which are used to construct the agent's representation of the world.
no code implementations • ICML 2017 • Matthew Schlegel, Yangchen Pan, Jiecao Chen, Martha White
In this work, we develop an approximately submodular criterion for this setting, and an efficient online greedy submodular maximization algorithm for optimizing the criterion.
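For reference, greedy submodular maximization repeatedly adds the element with the largest marginal gain. The sketch below is generic; `gain(S, e)` is a hypothetical marginal-gain oracle standing in for the paper's criterion, not its actual implementation.

```python
def greedy_maximize(candidates, gain, budget):
    """Greedily select up to `budget` elements by marginal gain."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < budget:
        best = max(remaining, key=lambda e: gain(selected, e))
        if gain(selected, best) <= 0:
            break  # no element adds value
        selected.append(best)
        remaining.remove(best)
    return selected
```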