2 code implementations • 6 Feb 2022 • Christina Göpfert, Alex Haig, Yinlam Chow, Chih-Wei Hsu, Ivan Vendrov, Tyler Lu, Deepak Ramachandran, Hubert Pham, Mohammad Ghavamzadeh, Craig Boutilier
Interactive recommender systems have emerged as a promising paradigm to overcome the limitations of the primitive user feedback used by traditional recommender systems (e.g., clicks, item consumption, ratings).
1 code implementation • ICML 2020 • Andy Su, Jayden Ooi, Tyler Lu, Dale Schuurmans, Craig Boutilier
Delusional bias is a fundamental source of error in approximate Q-learning.
no code implementations • 20 Nov 2019 • Ivan Vendrov, Tyler Lu, Qingqing Huang, Craig Boutilier
Effective techniques for eliciting user preferences have taken on added importance as recommender systems (RSs) become increasingly interactive and conversational.
1 code implementation • NeurIPS 2018 • Nevena Lazic, Craig Boutilier, Tyler Lu, Eehern Wong, Binz Roy, Mk Ryu, Greg Imwalle
Despite impressive recent advances in reinforcement learning (RL), its deployment in real-world physical systems is often complicated by unexpected events, limited data, and the potential for expensive failures.
no code implementations • NeurIPS 2018 • Tyler Lu, Dale Schuurmans, Craig Boutilier
We identify a fundamental source of error in Q-learning and other forms of dynamic programming with function approximation.
no code implementations • 30 Nov 2017 • Tyler Lu, Martin Zinkevich, Craig Boutilier, Binz Roy, Dale Schuurmans
Motivated by the cooling of Google's data centers, we study how one can safely identify the parameters of a system model with a desired accuracy and confidence level.