no code implementations • 18 Jun 2021 • James Cheshire, Pierre Ménard, Alexandra Carpentier
Taking $K$ as the number of arms, we consider (i) the case where the sequence of arm means $(\mu_k)_{k=1}^K$ is monotonically increasing (MTBP) and (ii) the case where $(\mu_k)_{k=1}^K$ is concave (CTBP).
no code implementations • NeurIPS 2021 • Rianne de Heide, James Cheshire, Pierre Ménard, Alexandra Carpentier
We characterize the optimal learning rates in both the cumulative regret setting and the best-arm identification setting, in terms of the problem parameters $T$ (the budget), $p^*$ and $\Delta$.
no code implementations • 17 Jun 2020 • James Cheshire, Pierre Ménard, Alexandra Carpentier
We prove that the minimax rates for the regret are (i) $\sqrt{\log(K)K/T}$ for TBP, (ii) $\sqrt{\log(K)/T}$ for MTBP, (iii) $\sqrt{K/T}$ for UTBP and (iv) $\sqrt{\log\log(K)/T}$ for CTBP, where $K$ is the number of arms and $T$ is the budget.
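To get a feel for how far apart these four rates are, the expressions can be evaluated numerically. The sketch below is only an illustration: the function name `minimax_regret_rates` is hypothetical, all constants are dropped (the abstract gives rates only up to constants), and rate (iv) is read as $\sqrt{(\log\log K)/T}$, which is an assumption about the intended grouping.

```python
import math


def minimax_regret_rates(K: int, T: int) -> dict[str, float]:
    """Evaluate the four rate expressions from the abstract, constants omitted.

    K is the number of arms and T is the budget. The function name and the
    reading of rate (iv) as sqrt(log(log(K)) / T) are assumptions made for
    this illustration.
    """
    if K < 3 or T < 1:
        # log(log(K)) is only positive for K >= 3.
        raise ValueError("need K >= 3 and T >= 1")
    return {
        "TBP": math.sqrt(math.log(K) * K / T),
        "MTBP": math.sqrt(math.log(K) / T),
        "UTBP": math.sqrt(K / T),
        "CTBP": math.sqrt(math.log(math.log(K)) / T),
    }


if __name__ == "__main__":
    for name, rate in minimax_regret_rates(K=100, T=10_000).items():
        print(f"{name}: {rate:.4f}")
```

For $K = 100$ and $T = 10{,}000$ the values are ordered TBP > UTBP > MTBP > CTBP, which matches the intuition that each added shape constraint makes the problem easier.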