1 code implementation • 10 Dec 2020 • Dorian Baudry, Romain Gautron, Emilie Kaufmann, Odalric-Ambrym Maillard
In this paper, we study a multi-armed bandit problem in which the quality of each arm is measured by the Conditional Value at Risk (CVaR) of its reward distribution at some level alpha.
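To make the criterion concrete, here is a minimal sketch of how the CVaR of an arm could be estimated from observed rewards. It assumes the lower-tail convention for rewards, CVaR_alpha(X) = sup_x { x - E[max(x - X, 0)] / alpha }, which for alpha = 1 reduces to the mean and for small alpha focuses on the worst outcomes. The function name `empirical_cvar` and the sample data are illustrative and not taken from the paper or its code.

```python
import numpy as np


def empirical_cvar(rewards, alpha):
    """Empirical CVaR at level alpha, lower-tail (reward) convention.

    Evaluates x - mean(max(x - X, 0)) / alpha at every sample point x
    and returns the largest value. The objective is concave and
    piecewise linear in x, so its maximum is attained at a sample point.
    """
    x = np.asarray(rewards, dtype=float)
    if x.size == 0:
        raise ValueError("rewards must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    # shortfall[i] = average amount by which the samples fall below x[i]
    shortfall = np.maximum(x[:, None] - x[None, :], 0.0).mean(axis=1)
    return float(np.max(x - shortfall / alpha))


# Two hypothetical arms with the same mean reward but different tails.
safe_arm = [0.5, 0.5, 0.5, 0.5]
risky_arm = [0.0, 0.0, 1.0, 1.0]

cvar_safe = empirical_cvar(safe_arm, alpha=0.5)
cvar_risky = empirical_cvar(risky_arm, alpha=0.5)
```

Under this convention the two hypothetical arms tie on mean reward, but the safe arm has the higher CVaR at alpha = 0.5, so a CVaR-based learner would prefer it.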