Search Results for author: Sumeet Katariya

Found 21 papers, 10 papers with code

Finite-Time Logarithmic Bayes Regret Upper Bounds

no code implementations • 15 Jun 2023 • Alexia Atsidakou, Branislav Kveton, Sumeet Katariya, Constantine Caramanis, Sujay Sanghavi

In a multi-armed bandit, we obtain $O(c_\Delta \log n)$ and $O(c_h \log^2 n)$ upper bounds for an upper confidence bound algorithm, where $c_h$ and $c_\Delta$ are constants depending on the prior distribution and the gaps of bandit instances sampled from it, respectively.

Selective Uncertainty Propagation in Offline RL

no code implementations • 1 Feb 2023 • Sanath Kumar Krishnamurthy, Shrey Modi, Tanmay Gangwani, Sumeet Katariya, Branislav Kveton, Anshuka Rangi

We consider the finite-horizon offline reinforcement learning (RL) setting, and are motivated by the challenge of learning the policy at any step h in dynamic programming (DP) algorithms.

Offline RL, reinforcement-learning

Multi-Task Off-Policy Learning from Bandit Feedback

no code implementations • 9 Dec 2022 • Joey Hong, Branislav Kveton, Sumeet Katariya, Manzil Zaheer, Mohammad Ghavamzadeh

We prove per-task bounds on the suboptimality of the learned policies, which show a clear improvement over not using the hierarchical model.

Learning-To-Rank, Recommendation Systems

Bayesian Fixed-Budget Best-Arm Identification

no code implementations • 15 Nov 2022 • Alexia Atsidakou, Sumeet Katariya, Sujay Sanghavi, Branislav Kveton

We also provide a lower bound on the probability of misidentification in a $2$-armed Bayesian bandit and show that our upper bound (almost) matches it for any budget.

Mixed-Effect Thompson Sampling

1 code implementation • 30 May 2022 • Imad Aouali, Branislav Kveton, Sumeet Katariya

The regret bound has two terms, one for learning the action parameters and the other for learning the shared effect parameters.

Thompson Sampling

Meta-Learning for Simple Regret Minimization

1 code implementation • 25 Feb 2022 • MohammadJavad Azizi, Branislav Kveton, Mohammad Ghavamzadeh, Sumeet Katariya

The Bayesian algorithm has access to a prior distribution over the meta-parameters, and its meta simple regret over $m$ bandit tasks with horizon $n$ is a mere $\tilde{O}(m / \sqrt{n})$.

Meta-Learning

Task-Agnostic Graph Explanations

1 code implementation • 16 Feb 2022 • Yaochen Xie, Sumeet Katariya, Xianfeng Tang, Edward Huang, Nikhil Rao, Karthik Subbian, Shuiwang Ji

They are also unable to provide explanations in cases where the GNN is trained in a self-supervised manner, and the resulting representations are used in future downstream tasks.

Deep Hierarchy in Bandits

no code implementations • 3 Feb 2022 • Joey Hong, Branislav Kveton, Sumeet Katariya, Manzil Zaheer, Mohammad Ghavamzadeh

We use this exact posterior to analyze the Bayes regret of HierTS in Gaussian bandits.

Thompson Sampling

Probabilistic Entity Representation Model for Reasoning over Knowledge Graphs

1 code implementation • NeurIPS 2021 • Nurendra Choudhary, Nikhil Rao, Sumeet Katariya, Karthik Subbian, Chandan K. Reddy

Current approaches employ spatial geometries such as boxes to learn query representations that encompass the answer entities and model the logical operations of projection and intersection.

Knowledge Graph Embedding, Knowledge Graphs

Task-Agnostic Graph Neural Explanations

no code implementations • 29 Sep 2021 • Yaochen Xie, Sumeet Katariya, Xianfeng Tang, Edward W Huang, Nikhil Rao, Karthik Subbian, Shuiwang Ji

TAGE enables the explanation of GNN embedding models without downstream tasks and allows efficient explanation of multitask models.

Pure Exploration with Structured Preference Feedback

no code implementations • 12 Apr 2021 • Shubham Gupta, Aadirupa Saha, Sumeet Katariya

We consider the problem of pure exploration with subset-wise preference feedback, which contains $N$ arms with features.

Decision Making

Self-Supervised Hyperboloid Representations from Logical Queries over Knowledge Graphs

1 code implementation • 23 Dec 2020 • Nurendra Choudhary, Nikhil Rao, Sumeet Katariya, Karthik Subbian, Chandan K. Reddy

Promising approaches to tackle this problem include embedding the KG units (e.g., entities and relations) in a Euclidean space such that the query embedding contains the information relevant to its results.

Anomaly Detection, Knowledge Graphs

Robust Outlier Arm Identification

1 code implementation • ICML 2020 • Yinglun Zhu, Sumeet Katariya, Robert Nowak

We study the problem of Robust Outlier Arm Identification (ROAI), where the goal is to identify arms whose expected rewards deviate substantially from the majority, by adaptively sampling from their reward distributions.

Outlier Detection

MaxGap Bandit: Adaptive Algorithms for Approximate Ranking

1 code implementation • NeurIPS 2019 • Sumeet Katariya, Ardhendu Tripathy, Robert Nowak

This paper studies the problem of adaptively sampling from K distributions (arms) in order to identify the largest gap between any two adjacent means.

Outlier Detection

Conservative Exploration using Interleaving

no code implementations • 3 Jun 2018 • Sumeet Katariya, Branislav Kveton, Zheng Wen, Vamsi K. Potluru

In many practical problems, a learning agent may want to learn the best action in hindsight without ever taking a bad action, that is, one significantly worse than the default production action.

Adaptive Sampling for Coarse Ranking

1 code implementation • 20 Feb 2018 • Sumeet Katariya, Lalit Jain, Nandana Sengupta, James Evans, Robert Nowak

We consider the problem of active coarse ranking, where the goal is to sort items according to their means into clusters of pre-specified sizes, by adaptively sampling from their reward distributions.

Bernoulli Rank-$1$ Bandits for Click Feedback

no code implementations • 19 Mar 2017 • Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen

The probability that a user will click a search result depends both on its relevance and its position on the results page.

Position

Stochastic Rank-1 Bandits

no code implementations • 10 Aug 2016 • Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen

The main challenge of the problem is that the individual values of the row and column are unobserved.

DCM Bandits: Learning to Rank with Multiple Clicks

1 code implementation • 9 Feb 2016 • Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Zheng Wen

This work presents the first practical and regret-optimal online algorithm for learning to rank with multiple clicks in a cascade-like click model.

Learning-To-Rank

Sparse Dueling Bandits

no code implementations • 31 Jan 2015 • Kevin Jamieson, Sumeet Katariya, Atul Deshpande, Robert Nowak

We prove that in the absence of structural assumptions, the sample complexity of this problem is proportional to the sum of the inverse squared gaps between the Borda scores of each suboptimal arm and the best arm.
