no code implementations • 20 Feb 2024 • Anand Kalvit, Aleksandrs Slivkins, Yonatan Gur
We study "incentivized exploration" (IE) in social learning problems where the principal (a recommendation algorithm) can leverage information asymmetry to incentivize sequentially arriving agents to take exploratory actions.
no code implementations • 18 Jan 2023 • Anand Kalvit, Assaf Zeevi
We also show that the instance-independent (minimax) regret is $\tilde{\mathcal{O}}\left( \sqrt{n} \right)$ when $K=2$.
no code implementations • 23 Oct 2021 • Anand Kalvit, Assaf Zeevi
We consider a bandit problem where at any time, the decision maker can add new arms to her consideration set.
no code implementations • NeurIPS 2021 • Anand Kalvit, Assaf Zeevi
One of the key drivers of complexity in the classical (stochastic) multi-armed bandit (MAB) problem is the difference between the mean rewards of the top two arms, also known as the instance gap.
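As a small illustration of the instance gap described in this abstract, the sketch below computes the difference between the two largest mean rewards of a bandit instance. The function name `instance_gap` and the example means are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def instance_gap(mean_rewards):
    """Return the difference between the largest and second-largest mean reward.

    `mean_rewards` is a hypothetical list holding one mean reward per arm.
    """
    if len(mean_rewards) < 2:
        raise ValueError("an instance gap needs at least two arms")
    # Sort in descending order and keep the two best arms.
    top, runner_up = sorted(mean_rewards, reverse=True)[:2]
    return top - runner_up


# Illustrative three-armed instance (values are assumptions, not from the paper).
gap = instance_gap([0.5, 0.75, 0.25])
```

A smaller gap means the top two arms are harder to tell apart from noisy samples, which is why this quantity is treated as a driver of complexity.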
no code implementations • NeurIPS 2020 • Anand Kalvit, Assaf Zeevi
We consider a stochastic bandit problem with countably many arms that belong to a finite set of types, each characterized by a unique mean reward.