Multi-Armed Bandits

195 papers with code • 1 benchmark • 2 datasets

Multi-armed bandits refer to a task in which a fixed amount of resources must be allocated among competing choices in a way that maximizes the expected gain. Typically these problems involve an exploration/exploitation trade-off: the learner must balance trying under-explored choices against repeating the choice that currently looks best.

(Image credit: Microsoft Research)
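To make the trade-off concrete, here is a minimal sketch of the classic epsilon-greedy policy on a Bernoulli bandit. The arm probabilities, horizon, and epsilon value are illustrative assumptions, not taken from any of the papers listed below.

```python
import random

def epsilon_greedy(true_probs, n_rounds=10_000, epsilon=0.1):
    """Epsilon-greedy on a Bernoulli bandit: explore a random arm with
    probability epsilon, otherwise exploit the best empirical mean."""
    n_arms = len(true_probs)
    counts = [0] * n_arms          # pulls per arm
    values = [0.0] * n_arms        # empirical mean reward per arm
    total_reward = 0.0
    for _ in range(n_rounds):
        if random.random() < epsilon:
            arm = random.randrange(n_arms)                     # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if random.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]    # running mean
        total_reward += reward
    return total_reward, values

# e.g. three arms with unknown success probabilities
print(epsilon_greedy([0.2, 0.5, 0.7]))
```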

Most implemented papers

Correlated Multi-armed Bandits with a Latent Random Source

shreyasc-13/correlated_bandits 17 Aug 2018

As a result, there are regimes where our algorithm achieves $\mathcal{O}(1)$ regret, as opposed to the typical logarithmic regret scaling of multi-armed bandit algorithms.
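For contrast with the $\mathcal{O}(1)$ regime claimed above, here is a minimal sketch of the standard UCB1 index policy, whose regret grows logarithmically with the horizon. The arm means and horizon are made-up values, and this is the classical baseline, not the paper's correlated-bandit algorithm.

```python
import math, random

def ucb1(true_probs, n_rounds=10_000):
    """UCB1: pull the arm maximizing empirical mean + confidence bonus.
    Its regret scales as O(log T), the baseline the paper improves on."""
    n_arms = len(true_probs)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            arm = t - 1                      # pull each arm once first
        else:
            arm = max(range(n_arms),
                      key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if random.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts  # pull counts concentrate on the best arm

print(ucb1([0.3, 0.5, 0.8]))
```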

Adapting multi-armed bandits policies to contextual bandits scenarios

david-cortes/contextualbandits 11 Nov 2018

This work explores adaptations of successful multi-armed bandits policies to the online contextual bandits scenario with binary rewards using binary classification algorithms such as logistic regression as black-box oracles.
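A hedged sketch of the general recipe described here: one binary classifier per arm serving as a black-box reward oracle, with epsilon-greedy action selection on top. The use of scikit-learn's LogisticRegression and the synthetic reward model are assumptions for illustration, not the repository's API.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_arms, dim, epsilon = 3, 5, 0.1
true_w = rng.normal(size=(n_arms, dim))          # hidden reward model
X_hist = [[] for _ in range(n_arms)]             # per-arm contexts
y_hist = [[] for _ in range(n_arms)]             # per-arm binary rewards
oracles = [LogisticRegression() for _ in range(n_arms)]

def predict(arm, x):
    # until an arm has observed both labels, treat it optimistically
    if len(set(y_hist[arm])) < 2:
        return 1.0
    return oracles[arm].predict_proba(x.reshape(1, -1))[0, 1]

for t in range(2000):
    x = rng.normal(size=dim)
    if rng.random() < epsilon:
        arm = rng.integers(n_arms)               # explore
    else:
        arm = int(np.argmax([predict(a, x) for a in range(n_arms)]))
    p = 1 / (1 + np.exp(-true_w[arm] @ x))       # true reward probability
    r = int(rng.random() < p)
    X_hist[arm].append(x); y_hist[arm].append(r)
    if len(set(y_hist[arm])) == 2:               # refit oracle when possible
        oracles[arm].fit(np.array(X_hist[arm]), np.array(y_hist[arm]))
```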

Bayesian Optimisation over Multiple Continuous and Categorical Inputs

rubinxin/CoCaBO_code ICML 2020

Efficient optimisation of black-box problems that comprise both continuous and categorical inputs is important, yet poses significant challenges.
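One simple way to picture the mixed search space this paper targets is to treat the categorical input as bandit arms and search the continuous input conditioned on the chosen category. The sketch below uses plain random search for the continuous part instead of a Gaussian-process surrogate, so it illustrates the problem structure under made-up assumptions, not the CoCaBO algorithm itself.

```python
import math, random

def objective(category, x):
    # toy mixed black-box: the category shifts the optimum of a 1-D function
    shift = {"a": 0.2, "b": 0.5, "c": 0.8}[category]
    return -abs(x - shift)

categories = ["a", "b", "c"]
counts = {c: 0 for c in categories}
means = {c: 0.0 for c in categories}
best = (-math.inf, "", 0.0)

for t in range(1, 301):
    # UCB over the categorical input (the "bandit" part)
    cat = max(categories,
              key=lambda c: math.inf if counts[c] == 0
              else means[c] + math.sqrt(2 * math.log(t) / counts[c]))
    x = random.random()                   # stand-in for continuous optimisation
    val = objective(cat, x)
    counts[cat] += 1
    means[cat] += (val - means[cat]) / counts[cat]
    best = max(best, (val, cat, x))

print(best)
```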

Multi-Armed Bandits with Correlated Arms

shreyasc-13/correlated_bandits 6 Nov 2019

We consider a multi-armed bandit framework where the rewards obtained by pulling different arms are correlated.
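A toy simulation of this setting, assuming the correlation arises from a shared latent variable (in the spirit of the authors' earlier latent-source model). The specific reward functions are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=100_000)       # shared latent random source
# each arm's reward is a different function of the same latent draw,
# so pulling one arm is informative about the others
rewards = np.stack([X, 1.0 - X, X ** 2])
print(np.corrcoef(rewards).round(2))      # strong cross-arm correlations
```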

The Unreasonable Effectiveness of Greedy Algorithms in Multi-Armed Bandit with Many Arms

khashayarkhv/many-armed-bandit 24 Feb 2020

This finding diverges from the notion of free exploration, which relates to covariate variation, as recently discussed in the contextual bandit literature.
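A minimal sketch of the greedy policy the title refers to, combined with the arm-subsampling step often paired with it in many-armed settings: pull each subsampled arm once, then commit to the best empirical mean with no further exploration. The number of arms, subsample size, and reward distributions are illustrative assumptions.

```python
import random

def subsampled_greedy(n_arms=1000, k=40, n_rounds=20_000):
    """Greedy on a random subsample of k << n_arms arms."""
    probs = [random.random() for _ in range(n_arms)]   # unknown arm means
    arms = random.sample(range(n_arms), k)
    counts = {a: 0 for a in arms}
    means = {a: 0.0 for a in arms}
    total = 0.0
    for t in range(n_rounds):
        if t < k:
            arm = arms[t]                              # one pull per arm
        else:
            arm = max(arms, key=lambda a: means[a])    # pure greedy after
        r = 1.0 if random.random() < probs[arm] else 0.0
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]
        total += r
    return total / n_rounds, max(probs)   # achieved mean vs. best possible

print(subsampled_greedy())
```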

Gaussian Gated Linear Networks

deepmind/deepmind-research NeurIPS 2020

We propose the Gaussian Gated Linear Network (G-GLN), an extension to the recently proposed GLN family of deep neural networks.

BanditPAM: Almost Linear Time $k$-Medoids Clustering via Multi-Armed Bandits

motiwari/BanditPAM-python 11 Jun 2020

Current state-of-the-art $k$-medoids clustering algorithms, such as Partitioning Around Medoids (PAM), are iterative and are quadratic in the dataset size $n$ for each iteration, being prohibitively expensive for large datasets.
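The core idea the paper exploits can be illustrated by estimating each candidate medoid's average distance from a random sample of points rather than from all $n$, which turns medoid selection into a best-arm identification problem. The sketch below shows only a naive fixed-sample estimate, not BanditPAM's adaptive sampling algorithm; the dataset and sample size are made up.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(5000, 2))            # dataset of n points

def sampled_medoid(X, n_ref=200):
    """Pick the 1-medoid by estimating each point's mean distance to a
    random reference sample (O(n * n_ref)) instead of all pairs (O(n^2))."""
    ref = X[rng.choice(len(X), size=n_ref, replace=False)]
    # estimated mean distance of every candidate to the reference sample
    est = np.linalg.norm(X[:, None, :] - ref[None, :, :], axis=-1).mean(axis=1)
    return int(np.argmin(est))

print(sampled_medoid(X))
```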

Dual-Mandate Patrols: Multi-Armed Bandits for Green Security

lily-x/dual-mandate 14 Sep 2020

Conservation efforts in green security domains to protect wildlife and forests are constrained by the limited availability of defenders (i.e., patrollers), who must patrol vast areas to protect from attackers (e.g., poachers or illegal loggers).

Neural Thompson Sampling

ZeroWeight/NeuralTS ICLR 2021

Thompson Sampling (TS) is one of the most effective algorithms for solving contextual multi-armed bandit problems.
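For background, a minimal sketch of classic Thompson Sampling on a (non-contextual) Bernoulli bandit with Beta posteriors. Neural Thompson Sampling replaces these conjugate posteriors with a neural-network-based posterior over rewards, which this sketch does not attempt; the arm probabilities are illustrative.

```python
import random

def thompson_sampling(true_probs, n_rounds=10_000):
    """Beta-Bernoulli Thompson Sampling: sample a mean from each arm's
    posterior and pull the argmax; posteriors sharpen as data arrives."""
    n_arms = len(true_probs)
    alpha = [1.0] * n_arms   # Beta(1, 1) uniform priors
    beta = [1.0] * n_arms
    for _ in range(n_rounds):
        theta = [random.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: theta[a])
        reward = 1 if random.random() < true_probs[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
    return alpha, beta       # posterior counts concentrate on the best arm

print(thompson_sampling([0.3, 0.5, 0.8]))
```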

Quantile Bandits for Best Arms Identification

Mengyanz/QSAR 22 Oct 2020

We consider a variant of the best arm identification task in stochastic multi-armed bandits.
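A hedged sketch of what a quantile-based variant of best arm identification can look like: uniform sampling followed by selecting the arm with the highest empirical quantile. The target quantile, budget, and reward distributions are made-up assumptions, and this is not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(3)

def best_arm_by_quantile(pulls_per_arm=2000, tau=0.5):
    """Identify the arm whose tau-quantile of rewards is largest,
    rather than the arm with the largest mean."""
    # toy arms: arm 1 has the higher median even though arm 0 has the higher mean
    arms = [lambda n: rng.exponential(1.0, n),     # mean 1.0, median ~0.69
            lambda n: rng.normal(0.8, 0.1, n)]     # mean 0.8, median 0.8
    samples = [draw(pulls_per_arm) for draw in arms]
    quantiles = [np.quantile(s, tau) for s in samples]
    return int(np.argmax(quantiles)), quantiles

print(best_arm_by_quantile())
```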