NeurIPS 2018 • Bernard Delyon, François Portier
Each stage $t$ consists of two steps: (i) explore the space with $n_t$ points drawn according to $q_t$, and (ii) exploit the information gathered so far to update the sampling policy.
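The explore/exploit loop can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the policy $q_t$ is restricted to one-dimensional Gaussians, the update step matches the weighted mean and variance of all points seen so far, and the names `adaptive_importance_sampling`, `normal_pdf`, and `unnormalized_target` are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def unnormalized_target(x):
    """Illustrative integrand: an unnormalized N(2, 1) density."""
    return math.exp(-0.5 * (x - 2.0) ** 2)


def adaptive_importance_sampling(target, stage_sizes, mu0=0.0, sigma0=3.0, seed=0):
    """Run one explore/exploit pass per entry of stage_sizes.

    Assumption: q_t is Gaussian and is refit by weighted moment matching.
    Returns (estimate of the integral of target, final mu, final sigma).
    """
    rng = random.Random(seed)
    mu, sigma = mu0, sigma0
    xs, ws = [], []
    for n_t in stage_sizes:
        # (i) Explore: draw n_t points from the current policy q_t
        # and record their importance weights target(x) / q_t(x).
        for _ in range(n_t):
            x = rng.gauss(mu, sigma)
            xs.append(x)
            ws.append(target(x) / normal_pdf(x, mu, sigma))
        # (ii) Exploit: refit the policy using every weighted point so far.
        total = sum(ws)
        if total > 0.0:
            mu = sum(w * x for w, x in zip(ws, xs)) / total
            var = sum(w * (x - mu) ** 2 for w, x in zip(ws, xs)) / total
            sigma = max(math.sqrt(var), 1e-3)  # floor keeps q_t non-degenerate
    # Each weight is unbiased for the integral given the past stages,
    # so their plain average estimates the integral of target.
    return sum(ws) / len(ws), mu, sigma


if __name__ == "__main__":
    estimate, mu, sigma = adaptive_importance_sampling(unnormalized_target, [200] * 10)
    print(estimate, mu, sigma)
```

In this toy setting the integral equals $\sqrt{2\pi}$, and the fitted policy should drift toward mean 2 and standard deviation 1 as the stages accumulate. The allocation `[200] * 10` is arbitrary; any sequence of stage sizes $n_t$ can be passed instead.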