Search Results for author: Aaron Sonabend-W

Found 3 papers, 1 paper with code

Semi-Supervised Off Policy Reinforcement Learning

no code implementations 9 Dec 2020 Aaron Sonabend-W, Nilanjana Laha, Ashwin N. Ananthakrishnan, Tianxi Cai, Rajarshi Mukherjee

2) The surrogate variables we leverage in the modified SSL framework are predictive of the outcome but not informative to the optimal policy or value function.

Imputation Q-Learning +2

Expert-Supervised Reinforcement Learning for Offline Policy Learning and Evaluation

2 code implementations NeurIPS 2020 Aaron Sonabend-W, Junwei Lu, Leo A. Celi, Tianxi Cai, Peter Szolovits

However, the adoption of such policies in practice is often challenging, as they are hard to interpret within the application context, and lack measures of uncertainty for the learned policy value and its decisions.

reinforcement-learning Reinforcement Learning (RL) +2
