no code implementations • 17 Dec 2023 • Longchao Da, Porter Jenkins, Trevor Schwantes, Jeffrey Dotson, Hua Wei
In this paper, we present Probabilistic Offline Policy Ranking (POPR), a framework that addresses offline policy ranking (OPR) problems by leveraging expert data to characterize the probability that a candidate policy behaves like the experts, and by approximating the full posterior distribution of its performance to aid ranking.
no code implementations • 9 Jan 2020 • Porter Jenkins, Hua Wei, J. Stockton Jenkins, Zhenhui Li
Learning important spatial patterns in offline retail is challenging because data are scarce and exploration and experimentation in the physical world are costly.