Multi-Armed Bandits

195 papers with code • 1 benchmark • 2 datasets

Multi-armed bandits refer to a task in which a fixed amount of resources must be allocated among competing choices in a way that maximizes the expected gain. Typically these problems involve an exploration/exploitation trade-off: the learner must balance trying under-explored choices against repeating the choice that currently looks best.

(Image credit: Microsoft Research)
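To make the trade-off concrete, here is a minimal sketch of the classic epsilon-greedy policy on a Bernoulli bandit. The arm probabilities, horizon, and epsilon value are illustrative assumptions, not taken from any of the papers listed below.

```python
import random

def epsilon_greedy(true_probs, n_rounds=10_000, epsilon=0.1):
    """Epsilon-greedy on a Bernoulli bandit: explore a random arm with
    probability epsilon, otherwise exploit the best empirical mean."""
    n_arms = len(true_probs)
    counts = [0] * n_arms          # pulls per arm
    values = [0.0] * n_arms        # empirical mean reward per arm
    total_reward = 0.0
    for _ in range(n_rounds):
        if random.random() < epsilon:
            arm = random.randrange(n_arms)                     # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if random.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]    # running mean
        total_reward += reward
    return total_reward, values

# e.g. three arms with unknown success probabilities
print(epsilon_greedy([0.2, 0.5, 0.7]))
```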

Most implemented papers

Correlated Multi-armed Bandits with a Latent Random Source

shreyasc-13/correlated_bandits 17 Aug 2018

As a result, there are regimes where our algorithm achieves $\mathcal{O}(1)$ regret, as opposed to the typical logarithmic regret scaling of multi-armed bandit algorithms.
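For contrast with the $\mathcal{O}(1)$ regime claimed above, here is a minimal sketch of the standard UCB1 index policy, whose regret grows logarithmically with the horizon. The arm means and horizon are made-up values, and this is the classical baseline, not the paper's correlated-bandit algorithm.

```python
import math, random

def ucb1(true_probs, n_rounds=10_000):
    """UCB1: pull the arm maximizing empirical mean + confidence bonus.
    Its regret scales as O(log T), the baseline the paper improves on."""
    n_arms = len(true_probs)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            arm = t - 1                      # pull each arm once first
        else:
            arm = max(range(n_arms),
                      key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if random.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts  # pull counts concentrate on the best arm

print(ucb1([0.3, 0.5, 0.8]))
```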

Adapting multi-armed bandits policies to contextual bandits scenarios

david-cortes/contextualbandits 11 Nov 2018

This work explores adaptations of successful multi-armed bandits policies to the online contextual bandits scenario with binary rewards using binary classification algorithms such as logistic regression as black-box oracles.
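A hedged sketch of the general recipe described here: one binary classifier per arm serving as a black-box reward oracle, with epsilon-greedy action selection on top. The use of scikit-learn's LogisticRegression and the synthetic reward model are assumptions for illustration, not the repository's API.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_arms, dim, epsilon = 3, 5, 0.1
true_w = rng.normal(size=(n_arms, dim))          # hidden reward model
X_hist = [[] for _ in range(n_arms)]             # per-arm contexts
y_hist = [[] for _ in range(n_arms)]             # per-arm binary rewards
oracles = [LogisticRegression() for _ in range(n_arms)]

def predict(arm, x):
    # until an arm has observed both labels, treat it optimistically
    if len(set(y_hist[arm])) < 2:
        return 1.0
    return oracles[arm].predict_proba(x.reshape(1, -1))[0, 1]

for t in range(2000):
    x = rng.normal(size=dim)
    if rng.random() < epsilon:
        arm = rng.integers(n_arms)               # explore
    else:
        arm = int(np.argmax([predict(a, x) for a in range(n_arms)]))
    p = 1 / (1 + np.exp(-true_w[arm] @ x))       # true reward probability
    r = int(rng.random() < p)
    X_hist[arm].append(x); y_hist[arm].append(r)
    if len(set(y_hist[arm])) == 2:               # refit oracle when possible
        oracles[arm].fit(np.array(X_hist[arm]), np.array(y_hist[arm]))
```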

Bayesian Optimisation over Multiple Continuous and Categorical Inputs

rubinxin/CoCaBO_code ICML 2020

Efficient optimisation of black-box problems that comprise both continuous and categorical inputs is important, yet poses significant challenges.
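One simple way to picture the mixed search space this paper targets is to treat the categorical input as bandit arms and search the continuous input conditioned on the chosen category. The sketch below uses plain random search for the continuous part instead of a Gaussian-process surrogate, so it illustrates the problem structure under made-up assumptions, not the CoCaBO algorithm itself.

```python
import math, random

def objective(category, x):
    # toy mixed black-box: the category shifts the optimum of a 1-D function
    shift = {"a": 0.2, "b": 0.5, "c": 0.8}[category]
    return -abs(x - shift)

categories = ["a", "b", "c"]
counts = {c: 0 for c in categories}
means = {c: 0.0 for c in categories}
best = (-math.inf, "", 0.0)

for t in range(1, 301):
    # UCB over the categorical input (the "bandit" part)
    cat = max(categories,
              key=lambda c: math.inf if counts[c] == 0
              else means[c] + math.sqrt(2 * math.log(t) / counts[c]))
    x = random.random()                   # stand-in for continuous optimisation
    val = objective(cat, x)
    counts[cat] += 1
    means[cat] += (val - means[cat]) / counts[cat]
    best = max(best, (val, cat, x))

print(best)
```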

Multi-Armed Bandits with Correlated Arms

shreyasc-13/correlated_bandits 6 Nov 2019

We consider a multi-armed bandit framework where the rewards obtained by pulling different arms are correlated.
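A toy simulation of this setting, assuming the correlation arises from a shared latent variable (in the spirit of the authors' earlier latent-source model). The specific reward functions are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=100_000)       # shared latent random source
# each arm's reward is a different function of the same latent draw,
# so pulling one arm is informative about the others
rewards = np.stack([X, 1.0 - X, X ** 2])
print(np.corrcoef(rewards).round(2))      # strong cross-arm correlations
```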

The Unreasonable Effectiveness of Greedy Algorithms in Multi-Armed Bandit with Many Arms

khashayarkhv/many-armed-bandit 24 Feb 2020

This finding diverges from the notion of free exploration, which relates to covariate variation, as recently discussed in the contextual bandit literature.
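A minimal sketch of the greedy policy the title refers to, combined with the arm-subsampling step often paired with it in many-armed settings: pull each subsampled arm once, then commit to the best empirical mean with no further exploration. The number of arms, subsample size, and reward distributions are illustrative assumptions.

```python
import random

def subsampled_greedy(n_arms=1000, k=40, n_rounds=20_000):
    """Greedy on a random subsample of k << n_arms arms."""
    probs = [random.random() for _ in range(n_arms)]   # unknown arm means
    arms = random.sample(range(n_arms), k)
    counts = {a: 0 for a in arms}
    means = {a: 0.0 for a in arms}
    total = 0.0
    for t in range(n_rounds):
        if t < k:
            arm = arms[t]                              # one pull per arm
        else:
            arm = max(arms, key=lambda a: means[a])    # pure greedy after
        r = 1.0 if random.random() < probs[arm] else 0.0
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]
        total += r
    return total / n_rounds, max(probs)   # achieved mean vs. best possible

print(subsampled_greedy())
```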

Gaussian Gated Linear Networks

deepmind/deepmind-research NeurIPS 2020

We propose the Gaussian Gated Linear Network (G-GLN), an extension to the recently proposed GLN family of deep neural networks.

BanditPAM: Almost Linear Time $k$-Medoids Clustering via Multi-Armed Bandits

motiwari/BanditPAM-python 11 Jun 2020

Current state-of-the-art $k$-medoids clustering algorithms, such as Partitioning Around Medoids (PAM), are iterative and are quadratic in the dataset size $n$ for each iteration, being prohibitively expensive for large datasets.
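The core idea the paper exploits can be illustrated by estimating each candidate medoid's average distance from a random sample of points rather than from all $n$, which turns medoid selection into a best-arm identification problem. The sketch below shows only a naive fixed-sample estimate, not BanditPAM's adaptive sampling algorithm; the dataset and sample size are made up.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(5000, 2))            # dataset of n points

def sampled_medoid(X, n_ref=200):
    """Pick the 1-medoid by estimating each point's mean distance to a
    random reference sample (O(n * n_ref)) instead of all pairs (O(n^2))."""
    ref = X[rng.choice(len(X), size=n_ref, replace=False)]
    # estimated mean distance of every candidate to the reference sample
    est = np.linalg.norm(X[:, None, :] - ref[None, :, :], axis=-1).mean(axis=1)
    return int(np.argmin(est))

print(sampled_medoid(X))
```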

Dual-Mandate Patrols: Multi-Armed Bandits for Green Security

lily-x/dual-mandate 14 Sep 2020

Conservation efforts in green security domains to protect wildlife and forests are constrained by the limited availability of defenders (i.e., patrollers), who must patrol vast areas to protect from attackers (e.g., poachers or illegal loggers).

Neural Thompson Sampling

ZeroWeight/NeuralTS ICLR 2021

Thompson Sampling (TS) is one of the most effective algorithms for solving contextual multi-armed bandit problems.
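For background, a minimal sketch of classic Thompson Sampling on a (non-contextual) Bernoulli bandit with Beta posteriors. Neural Thompson Sampling replaces these conjugate posteriors with a neural-network-based posterior over rewards, which this sketch does not attempt; the arm probabilities are illustrative.

```python
import random

def thompson_sampling(true_probs, n_rounds=10_000):
    """Beta-Bernoulli Thompson Sampling: sample a mean from each arm's
    posterior and pull the argmax; posteriors sharpen as data arrives."""
    n_arms = len(true_probs)
    alpha = [1.0] * n_arms   # Beta(1, 1) uniform priors
    beta = [1.0] * n_arms
    for _ in range(n_rounds):
        theta = [random.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: theta[a])
        reward = 1 if random.random() < true_probs[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
    return alpha, beta       # posterior counts concentrate on the best arm

print(thompson_sampling([0.3, 0.5, 0.8]))
```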

Quantile Bandits for Best Arms Identification

Mengyanz/QSAR 22 Oct 2020

We consider a variant of the best arm identification task in stochastic multi-armed bandits.
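A hedged sketch of what a quantile-based variant of best arm identification can look like: uniform sampling followed by selecting the arm with the highest empirical quantile. The target quantile, budget, and reward distributions are made-up assumptions, and this is not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(3)

def best_arm_by_quantile(pulls_per_arm=2000, tau=0.5):
    """Identify the arm whose tau-quantile of rewards is largest,
    rather than the arm with the largest mean."""
    # toy arms: arm 1 has the higher median even though arm 0 has the higher mean
    arms = [lambda n: rng.exponential(1.0, n),     # mean 1.0, median ~0.69
            lambda n: rng.normal(0.8, 0.1, n)]     # mean 0.8, median 0.8
    samples = [draw(pulls_per_arm) for draw in arms]
    quantiles = [np.quantile(s, tau) for s in samples]
    return int(np.argmax(quantiles)), quantiles

print(best_arm_by_quantile())
```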