no code implementations • 20 Feb 2024 • Nikola Pavlovic, Sudeep Salgia, Qing Zhao
Agents can share information through a central server, with the objective of minimizing the regret accumulated over the time horizon $T$ and aggregated over agents.
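As a point of reference only (the notation here is ours, not necessarily the paper's), such a group regret objective is commonly written as $R_M(T) = \sum_{m=1}^{M} \sum_{t=1}^{T} \big( f(x^*) - f(x_{m,t}) \big)$, where $M$ is the number of agents, $x_{m,t}$ is the point queried by agent $m$ at time $t$, and $x^*$ is the optimizer of the unknown function $f$.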
no code implementations • 23 Oct 2023 • Sudeep Salgia, Sattar Vakili, Qing Zhao
We consider Bayesian optimization using Gaussian Process models, also referred to as kernel-based bandit optimization.
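As a rough illustration of the kernel-based bandit setting (a generic GP-UCB-style loop, not the algorithm proposed in the paper; the RBF kernel, noise level, and confidence width are assumptions made only for this sketch):

# Generic Gaussian Process bandit sketch: fit a GP posterior to past queries,
# then pick the next query with an upper-confidence (UCB) acquisition rule.
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    # Squared-exponential kernel k(x, x') = exp(-(x - x')^2 / (2 * lengthscale^2))
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_obs, y_obs, x_test, noise=0.1):
    # Standard GP regression posterior mean and standard deviation on x_test.
    K = rbf_kernel(x_obs, x_obs) + noise ** 2 * np.eye(len(x_obs))
    Ks = rbf_kernel(x_test, x_obs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = np.diag(rbf_kernel(x_test, x_test)) - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

rng = np.random.default_rng(0)
f = lambda x: np.sin(3 * x)                      # illustrative unknown objective
grid = np.linspace(0.0, 1.0, 200)                # candidate query points
x_obs = np.array([0.5])
y_obs = f(x_obs) + 0.1 * rng.standard_normal(1)

for t in range(20):
    mean, std = gp_posterior(x_obs, y_obs, grid)
    x_next = grid[np.argmax(mean + 2.0 * std)]   # UCB acquisition, width 2.0 assumed
    y_next = f(x_next) + 0.1 * rng.standard_normal()
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, y_next)

print("best observed query:", x_obs[np.argmax(y_obs)])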
no code implementations • 21 Jan 2023 • Sudeep Salgia, Qing Zhao, Tamir Gabay, Kobi Cohen
We develop a distributed online learning algorithm that achieves order-optimal cumulative regret with low communication cost, measured as the total number of bits transmitted over the entire learning horizon.
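For intuition on what a bit-level communication budget can look like (a generic stochastic quantizer, not the encoding scheme developed in the paper; the 2-bit budget and the unit range are assumptions):

# Stochastic uniform quantizer: an agent encodes a bounded scalar statistic into
# a few bits for the server, with randomized rounding so the message is unbiased.
import numpy as np

rng = np.random.default_rng(3)

def quantize(value, lo, hi, bits=2):
    levels = 2 ** bits - 1
    scaled = (value - lo) / (hi - lo) * levels
    low = np.floor(scaled)
    return int(low + (rng.random() < (scaled - low)))   # index actually transmitted

def dequantize(index, lo, hi, bits=2):
    levels = 2 ** bits - 1
    return lo + index / levels * (hi - lo)

msg = quantize(0.37, lo=0.0, hi=1.0)           # agent -> server: 2 bits per round
print(msg, dequantize(msg, lo=0.0, hi=1.0))    # server's (unbiased) reconstruction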
no code implementations • 4 Nov 2022 • Sudeep Salgia, Qing Zhao
We consider distributed linear bandits where $M$ agents learn collaboratively to minimize the overall cumulative regret incurred by all agents.
no code implementations • 16 Jul 2022 • Sudeep Salgia, Sattar Vakili, Qing Zhao
We study collaborative learning among distributed clients facilitated by a central server.
no code implementations • 31 May 2022 • Sudeep Salgia, Sattar Vakili, Qing Zhao
The non-asymptotic error bounds may be of broader interest as a tool to establish the relation between the smoothness of the activation functions in neural contextual bandits and the smoothness of the kernels in kernel bandits.
no code implementations • 12 Oct 2021 • Sudeep Salgia, Qing Zhao, Lang Tong
The alternative hypothesis is a composite set of Lipschitz continuous distributions that are at least $\varepsilon$ away in $\ell_1$ distance from the uniform distribution.
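Spelled out (with the unit interval assumed as the support purely for concreteness), the test is between $H_0: p = \mathrm{Unif}[0,1]$ and the composite alternative $H_1: p$ Lipschitz continuous with $\|p - \mathrm{Unif}[0,1]\|_1 = \int_0^1 |p(x) - 1|\, dx \ge \varepsilon$.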
1 code implementation • NeurIPS 2021 • Sudeep Salgia, Sattar Vakili, Qing Zhao
We consider sequential optimization of an unknown function in a reproducing kernel Hilbert space.
no code implementations • ICML 2020 • Sudeep Salgia, Qing Zhao, Sattar Vakili
A framework based on iterative coordinate minimization (CM) is developed for stochastic convex optimization.
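As a minimal sketch of the coordinate minimization idea in a stochastic setting (generic cyclic-coordinate updates with noisy gradients; the step schedule and objective are illustrative assumptions, not the framework developed in the paper):

# Cyclic coordinate descent on a stochastic convex objective: at each round,
# only one coordinate is updated, using a noisy partial derivative.
import numpy as np

rng = np.random.default_rng(1)
d = 5
x = np.zeros(d)                                    # current iterate

def noisy_grad(x):
    # Stochastic gradient of the illustrative objective f(x) = 0.5 * ||x - 1||^2
    return (x - np.ones(d)) + 0.1 * rng.standard_normal(d)

for t in range(1, 501):
    i = t % d                                      # cyclic coordinate selection
    g = noisy_grad(x)[i]                           # noisy partial derivative
    x[i] -= g / np.sqrt(t)                         # decaying step along coordinate i

print("final iterate:", np.round(x, 2))            # approaches the all-ones minimizer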
no code implementations • 19 Apr 2019 • Boshuang Huang, Sudeep Salgia, Qing Zhao
We show that the proposed algorithm has a label complexity of $O(dT^{\frac{2-2\alpha}{2-\alpha}}\log^2 T)$ under a constraint of bounded regret in terms of classification errors, where $d$ is the VC dimension of the hypothesis space and $\alpha$ is the Tsybakov noise parameter.
no code implementations • 17 Jan 2019 • Sattar Vakili, Sudeep Salgia, Qing Zhao
Online minimization of an unknown convex function over the interval $[0, 1]$ is considered under first-order stochastic bandit feedback, which returns a random realization of the gradient of the function at each query point.
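To make the feedback model concrete (the projected stochastic-gradient loop below only illustrates the query/response protocol, not the algorithm analyzed in the paper):

# First-order stochastic bandit feedback on [0, 1]: each query at a point x
# returns a noisy realization of the gradient of the unknown convex function at x.
import numpy as np

rng = np.random.default_rng(2)
true_grad = lambda x: 2.0 * (x - 0.3)              # gradient of f(x) = (x - 0.3)^2, unknown to the learner

x = 0.5                                            # initial query point
for t in range(1, 1001):
    g = true_grad(x) + rng.standard_normal()       # stochastic first-order feedback
    x = min(max(x - g / t, 0.0), 1.0)              # projected gradient step back onto [0, 1]

print(f"estimated minimizer: {x:.3f}")             # close to the true minimizer 0.3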