1 code implementation • NeurIPS 2013 • Nan Ye, Adhiraj Somani, David Hsu, Wee Sun Lee
We show that the best policy obtained from a DESPOT (Determinized Sparse Partially Observable Tree) is near-optimal, with a regret bound that depends on the representation size of the optimal policy.
Autonomous Driving