no code implementations • ICML 2020 • Corinna Cortes, Giulia Desalvo, Claudio Gentile, Mehryar Mohri, Ningshan Zhang
A general framework for online learning with partial information is one where feedback graphs specify which losses the learner can observe.
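As a minimal illustration of the feedback-graph observation model (a sketch in the spirit of standard Exp3-style graph algorithms, not this paper's method): playing an action reveals the losses of its out-neighbors, and each observed loss is importance-weighted by its probability of being observed. The graph, learning rate, and loop length below are illustrative choices.

```python
import math
import random

K = 3
# Feedback graph as adjacency sets: playing action i reveals the losses
# of all actions in graph[i] (self-loops assumed, so i observes itself).
graph = {0: {0, 1}, 1: {1, 2}, 2: {0, 2}}

weights = [1.0] * K
eta = 0.1  # learning rate (illustrative choice)

def play(losses):
    """One round of exponential weights over importance-weighted
    estimates of the losses revealed by the feedback graph."""
    total = sum(weights)
    probs = [w / total for w in weights]
    i = random.choices(range(K), probs)[0]
    for j in graph[i]:
        # Probability that j's loss is observed: the chance of playing
        # any action whose out-neighborhood contains j.
        p_obs = sum(probs[a] for a in range(K) if j in graph[a])
        est = losses[j] / p_obs  # unbiased estimate of losses[j]
        weights[j] *= math.exp(-eta * est)
    return i

for t in range(30):
    play([random.random() for _ in range(K)])
```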
no code implementations • 7 Jun 2023 • Francesco Bonchi, Claudio Gentile, Francesco Paolo Nerini, André Panisson, Fabio Vitale
We present a new, scalable, and effective framework for training GNNs on node classification tasks, based on effective resistance, a powerful tool solidly rooted in graph theory.
no code implementations • 5 Jun 2023 • Aldo Pacchiano, Christoph Dann, Claudio Gentile
We consider model selection for sequential decision making in stochastic environments with bandit feedback, where a meta-learner has at its disposal a pool of base learners, and decides on the fly which action to take based on the policies recommended by each base learner.
no code implementations • NeurIPS 2023 • Guanghui Wang, Zihao Hu, Claudio Gentile, Vidya Muthukumar, Jacob Abernethy
To address this limitation, we present a series of state-of-the-art implicit bias rates for mirror descent and steepest descent algorithms.
no code implementations • 11 Feb 2023 • Stephen Pasteris, Fabio Vitale, Mark Herbster, Claudio Gentile, André Panisson
We investigate the problem of online collaborative filtering under no-repetition constraints, whereby users need to be served content in an online fashion and a given user cannot be recommended the same content item more than once.
no code implementations • 7 Feb 2023 • Alekh Agarwal, Claudio Gentile, Teodor V. Marinov
We study contextual bandit (CB) problems, where the user can sometimes respond with the best action in a given context.
no code implementations • 29 Nov 2022 • Sohan Rudra, Saksham Goel, Anirban Santara, Claudio Gentile, Laurent Perron, Fei Xia, Vikas Sindhwani, Carolina Parada, Gaurav Aggarwal
Object-goal navigation (Object-nav) entails searching for, recognizing, and navigating to a target object.
no code implementations • 29 Jun 2022 • Aldo Pacchiano, Christoph Dann, Claudio Gentile
We study the problem of model selection in bandit scenarios in the presence of nested policy classes, with the goal of obtaining simultaneous adversarial and stochastic ("best of both worlds") high-probability regret guarantees.
no code implementations • 11 Feb 2022 • Claudio Gentile, Zhilei Wang, Tong Zhang
We consider a batch active learning scenario where the learner adaptively issues batches of points to a labeling oracle.
no code implementations • 6 Dec 2021 • Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Claudio Gentile, Yishay Mansour
We investigate a nonstochastic bandit setting in which the loss of an action is not immediately charged to the player, but rather spread over the subsequent rounds in an adversarial way.
no code implementations • NeurIPS 2021 • Giulia Desalvo, Claudio Gentile, Tobias Sommer Thune
We derive a novel active learning algorithm in the streaming setting for binary classification tasks.
1 code implementation • NeurIPS 2021 • Gui Citovsky, Giulia Desalvo, Claudio Gentile, Lazaros Karydas, Anand Rajagopalan, Afshin Rostamizadeh, Sanjiv Kumar
The ability to train complex and highly effective models often requires an abundance of training data, which can easily become a bottleneck in cost, time, and computational resources.
no code implementations • NeurIPS 2020 • Dylan J. Foster, Claudio Gentile, Mehryar Mohri, Julian Zimmert
Given access to an online oracle for square loss regression, our algorithm attains optimal regret and -- in particular -- optimal dependence on the misspecification level, with no prior knowledge.
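An oracle-based reduction of this kind can be illustrated with inverse-gap weighting, a standard scheme for converting a square-loss regression oracle's predictions into an action distribution. This is a simplified sketch under that assumption, not the paper's exact algorithm (which additionally adapts to the misspecification level).

```python
def inverse_gap_weighting(preds, gamma):
    """Turn a regression oracle's predicted losses into an action
    distribution: the predicted-best action gets the leftover mass,
    and every other action gets probability inversely proportional
    to its predicted loss gap (scaled by the exploration parameter gamma)."""
    K = len(preds)
    best = min(range(K), key=lambda i: preds[i])
    probs = [0.0] * K
    for i in range(K):
        if i != best:
            probs[i] = 1.0 / (K + gamma * (preds[i] - preds[best]))
    probs[best] = 1.0 - sum(probs)  # remaining mass on the greedy action
    return probs
```

Larger `gamma` concentrates mass on the predicted-best action; `gamma = 0` recovers the uniform distribution.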
no code implementations • 7 Jun 2021 • Anirban Santara, Claudio Gentile, Gaurav Aggarwal, Shuai Li
Motivated by problems of learning to rank long item sequences, we introduce a variant of the cascading bandit model that considers flexible length sequences with varying rewards and losses.
no code implementations • NeurIPS 2021 • Pranjal Awasthi, Christoph Dann, Claudio Gentile, Ayush Sekhari, Zhilei Wang
We investigate the problem of active learning in the streaming setting in non-parametric regimes, where the labels are stochastically generated from a class of functions on which we make no assumptions whatsoever.
no code implementations • 24 Dec 2020 • Aldo Pacchiano, Christoph Dann, Claudio Gentile, Peter Bartlett
Finally, unlike recent efforts in model selection for linear stochastic bandits, our approach is versatile enough to also cover cases where the context information is generated by an adversarial environment, rather than a stochastic one.
no code implementations • 7 Dec 2020 • Leonardo Cella, Claudio Gentile, Massimiliano Pontil
Unlike known model selection efforts in the recent bandit literature, our algorithm exploits the specific structure of the problem to learn the unknown parameters of the expected loss function so as to identify the best arm as quickly as possible.
no code implementations • ICML 2020 • Corinna Cortes, Giulia Desalvo, Claudio Gentile, Mehryar Mohri, Ningshan Zhang
We present a new active learning algorithm that adaptively partitions the input space into a finite number of regions, and subsequently seeks a distinct predictor for each region, both phases actively requesting labels.
no code implementations • NeurIPS 2019 • Fabio Vitale, Anand Rajagopalan, Claudio Gentile
We investigate active learning by pairwise similarity over the leaves of trees originating from hierarchical clustering procedures.
no code implementations • NeurIPS 2018 • Fabio Vitale, Nikos Parotsidis, Claudio Gentile
A reciprocal recommendation problem is one where the goal of learning is not just to predict a user's preference towards a passive item (e.g., a book), but to recommend to a target user on one side another user from the other side, such that a mutual interest between the two exists.
no code implementations • 19 Jun 2017 • Stephen Pasteris, Fabio Vitale, Claudio Gentile, Mark Herbster
We measure performance not based on the recovery of the hidden similarity function, but instead on how well we classify each item.
no code implementations • NeurIPS 2017 • Nicolò Cesa-Bianchi, Claudio Gentile, Gábor Lugosi, Gergely Neu
Boltzmann exploration is a classic strategy for sequential decision-making under uncertainty, and is one of the most standard tools in Reinforcement Learning (RL).
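Boltzmann exploration itself is simple to state: actions are sampled from a softmax distribution over estimated action values, with a temperature parameter controlling how much probability mass leaks onto apparently suboptimal actions. A minimal sketch:

```python
import math
import random

def boltzmann_action(q_values, temperature):
    """Sample an action from the softmax (Boltzmann) distribution over
    estimated action values; higher temperature means more exploration,
    and temperature -> 0 recovers greedy selection."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(q_values)), probs)[0]
```

The paper's point is that fixed or naively annealed temperature schedules can fail, which motivates its more careful variants of this basic rule.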
no code implementations • ICML 2018 • Corinna Cortes, Giulia Desalvo, Claudio Gentile, Mehryar Mohri, Scott Yang
In the stochastic setting, we first point out a bias problem that limits the straightforward extension of algorithms such as UCB-N to time-varying feedback graphs, as needed in this context.
no code implementations • 27 Feb 2017 • Nicolò Cesa-Bianchi, Pierre Gaillard, Claudio Gentile, Sébastien Gerchinovitz
We investigate contextual online learning with nonparametric (Lipschitz) comparison classes under different assumptions on losses and feedback information.
no code implementations • ICML 2017 • Claudio Gentile, Shuai Li, Purushottam Kar, Alexandros Karatzoglou, Evans Etrue, Giovanni Zappella
We investigate CAB, a novel cluster-of-bandits algorithm for collaborative recommendation tasks that implements the underlying feedback-sharing mechanism by estimating the neighborhood of users in a context-dependent manner.
no code implementations • 1 Jun 2016 • Géraud Le Falher, Nicolò Cesa-Bianchi, Claudio Gentile, Fabio Vitale
In the problem of edge sign prediction, we are given a directed graph (representing a social network), and our task is to predict the binary labels of the edges (i.e., the positive or negative nature of the social relationships).
no code implementations • 2 May 2016 • Shuai Li, Claudio Gentile, Alexandros Karatzoglou
We investigate an efficient context-dependent clustering technique for recommender systems based on exploration-exploitation strategies through multi-armed bandits over multiple users.
no code implementations • 15 Feb 2016 • Nicolò Cesa-Bianchi, Claudio Gentile, Yishay Mansour, Alberto Minora
We introduce \textsc{Exp3-Coop}, a cooperative version of the \textsc{Exp3} algorithm, and prove that with $K$ actions and $N$ agents the average per-agent regret after $T$ rounds is at most of order $\sqrt{\bigl(d+1 + \tfrac{K}{N}\alpha_{\le d}\bigr)(T\ln K)}$, where $\alpha_{\le d}$ is the independence number of the $d$-th power of the connected communication graph $G$.
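For context, the single-agent Exp3 building block can be sketched as standard exponential weights over importance-weighted bandit loss estimates; the cooperative version additionally shares these estimates across agents over the communication graph. The learning rate and horizon below are illustrative choices.

```python
import math
import random

def exp3(K, T, eta, loss_fn):
    """Basic Exp3: sample an action from the exponential-weights
    distribution, observe only the played action's loss, and update
    with an importance-weighted (unbiased) loss estimate."""
    weights = [1.0] * K
    for t in range(T):
        total = sum(weights)
        probs = [w / total for w in weights]
        i = random.choices(range(K), probs)[0]
        loss = loss_fn(t, i)       # bandit feedback: one loss per round
        est = loss / probs[i]      # unbiased estimate of the played loss
        weights[i] *= math.exp(-eta * est)
    return weights
```

Running it on a two-armed instance where arm 1 always incurs zero loss quickly shifts the weight mass onto arm 1.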
no code implementations • 11 Feb 2015 • Shuai Li, Alexandros Karatzoglou, Claudio Gentile
Our algorithm takes into account the collaborative effects that arise due to the interaction of the users with the items, by dynamically grouping users based on the items under consideration and, at the same time, grouping items based on the similarity of the clusterings induced over the users.
no code implementations • 30 Sep 2014 • Noga Alon, Nicolò Cesa-Bianchi, Claudio Gentile, Shie Mannor, Yishay Mansour, Ohad Shamir
This naturally models several situations where the losses of different actions are related, and knowing the loss of one action provides information on the loss of other actions.
no code implementations • 31 Jan 2014 • Claudio Gentile, Shuai Li, Giovanni Zappella
We introduce a novel algorithmic approach to content recommendation based on adaptive clustering of exploration-exploitation ("bandit") strategies.
no code implementations • NeurIPS 2013 • Noga Alon, Nicolò Cesa-Bianchi, Claudio Gentile, Yishay Mansour
We consider the partial observability model for multi-armed bandits, introduced by Mannor and Shamir.
no code implementations • NeurIPS 2013 • Nicolò Cesa-Bianchi, Claudio Gentile, Giovanni Zappella
Multi-armed bandit problems are receiving a great deal of attention because they adequately formalize the exploration-exploitation trade-offs arising in several industrially relevant applications, such as online advertisement and, more generally, recommendation systems.
no code implementations • NeurIPS 2012 • Nicolò Cesa-Bianchi, Claudio Gentile, Fabio Vitale, Giovanni Zappella
We provide a theoretical analysis within this model, showing that we can achieve an optimal (to within a constant factor) number of mistakes on any graph $G = (V, E)$ such that $|E|$ is at least order of $|V|^{3/2}$ by querying at most order of $|V|^{3/2}$ edge labels.
no code implementations • NeurIPS 2012 • Claudio Gentile, Francesco Orabona
We present a novel multilabel/ranking algorithm working in partial information settings.
no code implementations • NeurIPS 2011 • Fabio Vitale, Nicolò Cesa-Bianchi, Claudio Gentile, Giovanni Zappella
Although it is known how to predict the nodes of an unweighted tree in a nearly optimal way, in the weighted case a fully satisfactory algorithm is not available yet.
no code implementations • NeurIPS 2008 • Giovanni Cavallanti, Nicolò Cesa-Bianchi, Claudio Gentile
Using the so-called Tsybakov low noise condition to parametrize the instance distribution, we show bounds on the convergence rate to the Bayes risk of both the fully supervised and the selective sampling versions of the basic algorithm.
no code implementations • NeurIPS 2007 • Claudio Gentile, Fabio Vitale, Cristian Brotto
A new algorithm for on-line learning of linear-threshold functions is proposed which efficiently combines second-order statistics about the data with the "logarithmic behavior" of multiplicative/dual-norm algorithms.