Generalized non-stationary bandits

1 Feb 2021  ·  Anne Gael Manegueu, Alexandra Carpentier, Yi Yu ·

In this paper, we study a non-stationary stochastic bandit problem, which generalizes the switching bandit problem. On top of the switching bandit problem (\textbf{Case a}), we are interested in three concrete examples: (\textbf{b}) the means of the arms are local polynomials, (\textbf{c}) the means of the arms are locally smooth, and (\textbf{d}) the gaps of the arms have a bounded number of inflexion points and where the highest arm mean cannot vary too much in a short range. These three settings are very different, but have in common the following: (i) the number of similarly-sized level sets of the logarithm of the gaps can be controlled, and (ii) the highest mean has a limited number of abrupt changes, and otherwise has limited variations. We propose a single algorithm in this general setting, that in particular solves in an efficient and unified way the four problems (a)-(d) mentioned.

PDF Abstract
No code implementations yet. Submit your code now

Tasks


Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here