no code implementations • 22 Dec 2022 • Arun Verma
This thesis considers sequential decision problems, where the loss/reward incurred by selecting an action may not be inferred from observed feedback.
1 code implementation • 19 Jun 2022 • Arun Verma, Zhongxiang Dai, Bryan Kian Hsiang Low
Existing Bayesian optimization (BO) methods assume that the function evaluation (feedback) is available to the learner immediately or after a fixed delay.
1 code implementation • 28 May 2022 • Zhongxiang Dai, Yao Shu, Arun Verma, Flint Xiaofeng Fan, Bryan Kian Hsiang Low, Patrick Jaillet
To better exploit the federated setting, FN-UCB adopts a weighted combination of two UCBs: $\text{UCB}^{a}$ allows every agent to additionally use the observations from the other agents to accelerate exploration (without sharing raw observations), while $\text{UCB}^{b}$ uses an NN with aggregated parameters for reward prediction in a similar way to federated averaging for supervised learning.
1 code implementation • 6 May 2022 • Srijith Balakrishnan, Beatrice Cassottana, Arun Verma
Finally, ML algorithms are used to develop models that predict the network-wide impacts of disruptive events using the cluster-level features.
no code implementations • NeurIPS 2021 • Arun Verma, Manjesh K. Hanawal
This paper studies a new variant of the stochastic multi-armed bandits problem where auxiliary information about the arm rewards is available in the form of control variates.
no code implementations • 12 Apr 2021 • Arun Verma, Manjesh K. Hanawal, Arun Rajkumar, Raman Sankaran
The loss depends on two hidden parameters: one is specific to the arm but independent of the resource allocation, and the other depends on the allocated resource.
no code implementations • NeurIPS 2020 • Arun Verma, Manjesh K. Hanawal, Csaba Szepesvári, Venkatesh Saligrama
In this paper, we study Contextual Unsupervised Sequential Selection (USS), a new variant of the stochastic contextual bandits problem where the loss of an arm cannot be inferred from the observed feedback.
no code implementations • 16 Sep 2020 • Arun Verma, Manjesh K. Hanawal, Nandyala Hemachandra
The total loss is the sum of the cost incurred for selecting the arm and the stochastic loss associated with the selected arm.
no code implementations • 17 Jun 2020 • Arun Verma, Manjesh K. Hanawal
We model this problem setting as a bandit setting where feedback obtained in each round depends on the resource allocated to the agents.
no code implementations • 13 Mar 2020 • Debamita Ghosh, Arun Verma, Manjesh K. Hanawal
It is thus important to learn the least amount of energy harvested by nodes so that the source can transmit on a frequency band that maximizes this amount.
no code implementations • 25 Dec 2019 • Arun Verma, Manjesh K. Hanawal, Nandyala Hemachandra
In medical diagnosis, physicians predict the state of a patient by checking measurements (features) obtained from a sequence of tests, e.g., blood test, urine test, followed by invasive tests.
1 code implementation • NeurIPS 2019 • Arun Verma, Manjesh K. Hanawal, Arun Rajkumar, Raman Sankaran
We study this novel setting by establishing its 'equivalence' to Multiple-Play Multi-Armed Bandits (MP-MAB) and Combinatorial Semi-Bandits.
no code implementations • 15 Jan 2019 • Arun Verma, Manjesh K. Hanawal, Csaba Szepesvári, Venkatesh Saligrama
We set up the USS problem as a stochastic partial monitoring problem and develop an algorithm with sub-linear regret under the WD property.