no code implementations • 27 Feb 2024 • Jincheng Mei, Zixin Zhong, Bo Dai, Alekh Agarwal, Csaba Szepesvari, Dale Schuurmans
We show that the stochastic gradient bandit algorithm converges to a globally optimal policy at an $O(1/t)$ rate, even with a constant step size.
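The algorithm in question is the classic softmax policy-gradient (REINFORCE) bandit update with a fixed learning rate. A minimal simulation sketch, assuming Gaussian rewards; the arm means, step size, and horizon below are illustrative choices, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(theta):
    z = np.exp(theta - theta.max())
    return z / z.sum()


def stochastic_gradient_bandit(means, steps=20000, eta=0.5):
    """Softmax policy-gradient bandit with a constant step size eta."""
    K = len(means)
    theta = np.zeros(K)  # logits of the softmax policy
    for _ in range(steps):
        pi = softmax(theta)
        a = rng.choice(K, p=pi)        # sample an arm from the policy
        r = rng.normal(means[a], 0.1)  # noisy reward for the pulled arm
        # REINFORCE gradient for the softmax parameterization:
        # grad = r * (e_a - pi), where e_a is the indicator of the pulled arm
        grad = -r * pi
        grad[a] += r
        theta += eta * grad            # constant step size, no decay
    return softmax(theta)


pi = stochastic_gradient_bandit(np.array([0.2, 0.5, 0.9]))
print(pi)  # probability mass should concentrate on the best arm (index 2)
```

Despite the constant step size, the policy concentrates on the optimal arm, which is the qualitative behavior the $O(1/t)$ global-convergence result describes.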
1 code implementation • 31 Jan 2023 • Yunlong Hou, Vincent Y. F. Tan, Zixin Zhong
Under this constraint, we design and analyze an algorithm, PASCombUCB, that minimizes the regret over a time horizon $T$.
1 code implementation • 23 Oct 2022 • Yi Wei, Zixin Zhong, Vincent Y. F. Tan
The beam alignment (BA) problem consists in accurately aligning the transmitter and receiver beams to establish a reliable communication link in wireless communication systems.
no code implementations • 9 Feb 2022 • Junwen Yang, Zixin Zhong, Vincent Y. F. Tan
This paper considers the problem of online clustering with bandit feedback.
1 code implementation • 25 Jan 2022 • Yunlong Hou, Vincent Y. F. Tan, Zixin Zhong
We design and analyze VA-LUCB, a parameter-free algorithm, for identifying the best arm under the fixed-confidence setup and under a stringent constraint that the variance of the chosen arm is strictly smaller than a given threshold.
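VA-LUCB itself samples adaptively using confidence bounds; the naive uniform-sampling sketch below only illustrates the problem it solves, namely picking the highest-mean arm among arms whose variance is below a threshold. All arm parameters and sample counts here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)


def variance_constrained_best_arm(means, stds, var_threshold, samples=5000):
    """Estimate each arm's mean and variance from uniform samples, then
    return the highest-mean arm whose empirical variance is below the
    threshold.  (This is a brute-force illustration of the objective,
    not the adaptive, parameter-free VA-LUCB algorithm.)"""
    K = len(means)
    best, best_mean = None, -np.inf
    for k in range(K):
        x = rng.normal(means[k], stds[k], size=samples)
        m, v = x.mean(), x.var(ddof=1)
        if v < var_threshold and m > best_mean:
            best, best_mean = k, m
    return best


# Arm 2 has the highest mean but violates the variance constraint,
# so the feasible best arm is arm 1.
arm = variance_constrained_best_arm(
    means=[0.3, 0.6, 0.9], stds=[0.2, 0.3, 1.0], var_threshold=0.5)
print(arm)
```

The example shows why the constraint matters: the unconstrained best arm (index 2) is excluded because its variance exceeds the threshold.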
no code implementations • 16 Oct 2021 • Zixin Zhong, Wang Chi Cheung, Vincent Y. F. Tan
We study the Pareto frontier of two archetypal objectives in multi-armed bandits, namely, regret minimization (RM) and best arm identification (BAI) with a fixed horizon.
1 code implementation • 15 Oct 2020 • Zixin Zhong, Wang Chi Cheung, Vincent Y. F. Tan
When the amount of corruptions per step (CPS) is below a threshold, PSS($u$) identifies the best arm or item with probability tending to $1$ as $T\rightarrow \infty$.
no code implementations • ICML 2020 • Zixin Zhong, Wang Chi Cheung, Vincent Y. F. Tan
Finally, extensive numerical simulations corroborate the efficacy of CascadeBAI as well as the tightness of our upper bound on its time complexity.
no code implementations • 2 Oct 2018 • Zixin Zhong, Wang Chi Cheung, Vincent Y. F. Tan
While Thompson sampling (TS) algorithms have been shown to be empirically superior to Upper Confidence Bound (UCB) algorithms for cascading bandits, theoretical guarantees are only known for the latter.
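A generic Beta-Bernoulli Thompson sampling sketch for the cascading click model, in which a user scans a ranked list and clicks the first attractive item; this is a textbook illustration of the setup, not the specific TS algorithm analyzed in the paper, and the attraction weights and horizon are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)


def ts_cascade(weights, list_len=2, horizon=3000):
    """Beta-Bernoulli Thompson sampling for a cascading bandit: each
    round, recommend the list_len items with the highest sampled
    attraction probabilities; the user clicks the first attractive
    item and ignores everything after it."""
    K = len(weights)
    alpha, beta = np.ones(K), np.ones(K)       # Beta(1, 1) priors
    for _ in range(horizon):
        theta = rng.beta(alpha, beta)          # posterior samples
        ranked = np.argsort(-theta)[:list_len]  # recommended list
        for item in ranked:
            click = rng.random() < weights[item]  # Bernoulli attraction
            if click:
                alpha[item] += 1
                break                          # cascade stops at the click
            beta[item] += 1                    # examined but not clicked
    return alpha / (alpha + beta)              # posterior mean estimates


est = ts_cascade(np.array([0.2, 0.3, 0.7, 0.8]))
print(est)  # estimates for the two most attractive items should dominate
```

The cascade structure is what distinguishes this from a standard bandit: only items up to and including the first click yield feedback, so the posterior updates are censored accordingly.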