no code implementations • 21 Dec 2022 • Asaf Cohen, Vijay G. Subramanian, Yili Zhang
We show that the algorithm achieves $O(1)$ regret when all optimal thresholds with full information are non-zero, and $O(\ln^{1+\epsilon}(N))$ regret for any specified $\epsilon>0$ when an optimal threshold with full information is $0$ (i.e., an optimal policy is to reject all arrivals), where $N$ is the number of arrivals.
no code implementations • 6 Feb 2019 • Erhan Bayraktar, Ibrahim Ekren, Yili Zhang
For the problem of prediction with expert advice in the adversarial setting with geometric stopping, we compute the exact leading-order expansion for the long-time behavior of the value function.