Search Results for author: William Overman

Found 1 paper, 0 papers with code

Online Resource Allocation in Episodic Markov Decision Processes

no code implementations • 18 May 2023 • Duksang Lee, William Overman, Dabeen Lee

For the observe-then-decide regime, we prove that the expected regret against the dynamic clairvoyant optimal policy is bounded by $\tilde{O}(\rho^{-1} H^{3/2} S \sqrt{AT})$, where $\rho \in (0, 1)$ is the budget parameter, $H$ is the length of the horizon, $S$ and $A$ are the numbers of states and actions, and $T$ is the number of episodes.
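To get a feel for how this bound scales, the sketch below evaluates only its leading term, $\rho^{-1} H^{3/2} S \sqrt{AT}$. It is an illustration under stated assumptions, not the paper's analysis: the function name and the problem sizes are hypothetical, and the constants and logarithmic factors hidden by the $\tilde{O}$ notation are dropped, so the numbers are meaningful only relative to one another.

```python
import math


def regret_bound_leading_term(rho: float, H: int, S: int, A: int, T: int) -> float:
    """Leading term rho^{-1} * H^{3/2} * S * sqrt(A * T) of the stated bound.

    Assumption: constants and log factors hidden by the tilde-O are ignored.
    """
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie in the open interval (0, 1)")
    return (H ** 1.5) * S * math.sqrt(A * T) / rho


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen only for illustration.
    for T in (100, 10_000, 1_000_000):
        bound = regret_bound_leading_term(rho=0.5, H=4, S=10, A=5, T=T)
        # The total grows like sqrt(T), so the per-episode average shrinks.
        print(f"T={T:>9}  leading term={bound:14.1f}  per episode={bound / T:10.4f}")
```

Because the leading term grows like $\sqrt{T}$, dividing it by $T$ gives a per-episode quantity that shrinks as the number of episodes grows, while a smaller budget parameter $\rho$ inflates the bound proportionally to $1/\rho$.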

Decision Making
