Search Results for author: Arpan Mukherjee

Found 7 papers, 0 papers with code

Robust Causal Bandits for Linear Models

no code implementations • 30 Oct 2023 • Zirui Yan, Arpan Mukherjee, Burak Varici, Ali Tajer

Cumulative regret is adopted as the design criterion, based on which the objective is to design a sequence of interventions that incur the smallest cumulative regret with respect to an oracle aware of the entire causal model and its fluctuations.

Optimal Best Arm Identification with Fixed Confidence in Restless Bandits

no code implementations • 20 Oct 2023 • P. N. Karthik, Vincent Y. F. Tan, Arpan Mukherjee, Ali Tajer

It is shown that under every policy, the state-action visitation proportions satisfy a specific approximate flow conservation constraint and that these proportions match the optimal proportions dictated by the lower bound under any asymptotically optimal policy.

Best Arm Identification in Stochastic Bandits: Beyond $\beta$-optimality

no code implementations • 10 Jan 2023 • Arpan Mukherjee, Ali Tajer

Two key metrics for assessing bandit algorithms are computational efficiency and performance optimality (e.g., in sample complexity).

Computational Efficiency • Multi-Armed Bandits

Active Sampling of Multiple Sources for Sequential Estimation

no code implementations • 10 Aug 2022 • Arpan Mukherjee, Ali Tajer, Pin-Yu Chen, Payel Das

Additionally, each process $i\in\{1, \dots, K\}$ has a private parameter $\alpha_i$.

SPRT-based Efficient Best Arm Identification in Stochastic Bandits

no code implementations • 22 Jul 2022 • Arpan Mukherjee, Ali Tajer

Based on this test statistic, a BAI algorithm is designed that leverages the canonical sequential probability ratio tests for arm selection and is amenable to tractable analysis for the exponential family of bandits.

Multi-Armed Bandits • Thompson Sampling

Best Arm Identification in Contaminated Stochastic Bandits

no code implementations • NeurIPS 2021 • Arpan Mukherjee, Ali Tajer, Pin-Yu Chen, Payel Das

Owing to the adversarial contamination of the rewards, each arm's mean is only partially identifiable.

Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination

no code implementations • NeurIPS 2021 • Arpan Mukherjee, Ali Tajer, Pin-Yu Chen, Payel Das

Owing to the adversarial contamination of the rewards, each arm's mean is only partially identifiable.
