29 Apr 2020 • Konstantin E. Avrachenkov, Vivek S. Borkar
A novel reinforcement learning algorithm is introduced for multi-armed restless bandits with average reward, using the paradigms of Q-learning and the Whittle index.
Q-Learning reinforcement-learning
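The abstract names two ingredients, average-reward Q-learning and the Whittle index, without showing how they might fit together. The sketch below is one plausible tabular combination and is not taken from the paper: it assumes a single arm with a small finite state space, a relative-value-iteration style update with a fixed reference entry, and a single scalar subsidy for passivity that is adjusted on a slower timescale. All function names, parameters, and step sizes (`rvi_q_update`, `index_update`, `select_arms`, `alpha`, `beta`, `ref`) are hypothetical.

```python
import numpy as np


def rvi_q_update(Q, s, a, r, s_next, subsidy, alpha, ref=(0, 0)):
    """One relative-value-iteration Q-learning step for a single arm.

    Q is an (n_states, 2) table; action 0 is passive, action 1 is active.
    The passive action earns an extra `subsidy` (the candidate index value).
    Subtracting Q[ref] is an assumed normalisation that keeps the
    average-reward iterates from drifting.
    """
    reward = r + (subsidy if a == 0 else 0.0)
    target = reward + Q[s_next].max() - Q[ref]
    Q[s, a] += alpha * (target - Q[s, a])
    return Q


def index_update(subsidy, Q, k, beta):
    """Slower-timescale step (assumed): move the subsidy toward the value at
    which active and passive look equally attractive in state k."""
    return subsidy + beta * (Q[k, 1] - Q[k, 0])


def select_arms(indices, m):
    """Return the positions of the m arms with the largest current indices."""
    return np.argsort(-np.asarray(indices, dtype=float), kind="stable")[:m]
```

In use, each arm would keep its own table and subsidy estimate, and at every step the controller would activate the arms returned by `select_arms`. Choosing `beta` much smaller than `alpha` is the usual two-timescale assumption, so that the Q-table roughly tracks the current subsidy before the subsidy moves.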