Search Results for author: Kyle Wray

Found 3 papers, 2 papers with code

Entropy-regularized Point-based Value Iteration

1 code implementation • 14 Feb 2024 • Harrison Delecki, Marcell Vazquez-Chanlatte, Esen Yel, Kyle Wray, Tomer Arnon, Stefan Witwicki, Mykel J. Kochenderfer

However, model-based planners may be brittle under these types of uncertainty because they rely on an exact model and tend to commit to a single optimal behavior.

Decision Making in Non-Stationary Environments with Policy-Augmented Search

1 code implementation • 6 Jan 2024 • Ava Pettet, Yunuo Zhang, Baiting Luo, Kyle Wray, Hendrik Baier, Aron Laszka, Abhishek Dubey, Ayan Mukhopadhyay

In this paper, we introduce Policy-Augmented Monte Carlo tree search (PA-MCTS), which combines action-value estimates from an out-of-date policy with an online search using an up-to-date model of the environment.

Decision Making • Decision Making Under Uncertainty • +2

Active teacher selection for reinforcement learning from human feedback

no code implementations • 23 Oct 2023 • Rachel Freedman, Justin Svegliato, Kyle Wray, Stuart Russell

The HUB framework and ATS algorithm demonstrate the importance of leveraging differences between teachers to learn accurate reward models, facilitating future research on active teacher selection for robust reward modeling.

Recommendation Systems • reinforcement-learning
