7 Feb 2019 • Nischal Agrawal, Prasanna Chaporkar
The multi-armed bandit (MAB) problem is a reinforcement learning framework in which an agent tries to maximise her profit by selecting actions well, using the absolute feedback she receives for each action.
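As an illustration of the setting described above, here is a minimal sketch of an agent playing a bandit. It rests on assumptions that are not from the source: the arms pay Bernoulli rewards, "absolute feedback" is read as a numeric reward observed for the chosen arm, and the agent uses an epsilon-greedy rule. The names `run_epsilon_greedy`, `arm_means`, `steps`, `epsilon`, and `seed` are hypothetical.

```python
import random


def run_epsilon_greedy(arm_means, steps=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with an epsilon-greedy agent.

    arm_means: success probability of each arm (assumed, unknown to the agent).
    Returns per-arm pull counts, per-arm reward estimates, and total reward.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running mean reward of each arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Absolute feedback: a numeric reward for the chosen arm only.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0

        # Incremental update of the running mean for that arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward


if __name__ == "__main__":
    counts, estimates, total = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("estimated means:", [round(e, 3) for e in estimates])
    print("total reward:", total)
```

Epsilon-greedy is used here only because it is short; any other selection rule could be swapped into the explore/exploit branch without changing the feedback loop.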