no code implementations • 6 Mar 2024 • Karthik Sridharan, Seung Won Wilson Yoo
We consider the problem of online learning where the sequence of actions played by the learner must adhere to an unknown safety constraint at every round.
no code implementations • 24 Jul 2023 • Ayush Sekhari, Karthik Sridharan, Wen Sun, Runzhe Wu
We consider the problem of contextual bandits and imitation learning, where the learner lacks direct knowledge of the executed action's reward.
no code implementations • 13 Oct 2022 • Satyen Kale, Jason D. Lee, Chris De Sa, Ayush Sekhari, Karthik Sridharan
When these potentials further satisfy certain self-bounding properties, we show that they can be used to provide a convergence guarantee for Gradient Descent (GD) and SGD (even when the paths of gradient flow (GF) and GD/SGD are quite far apart).
no code implementations • 27 Jun 2022 • Dylan J. Foster, Alexander Rakhlin, Ayush Sekhari, Karthik Sridharan
A central problem in online learning and decision making -- from bandits to reinforcement learning -- is to understand what modeling assumptions lead to sample-efficient learning guarantees.
no code implementations • 19 Jun 2022 • Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan
This paper presents a theoretical analysis of such policies and provides the first regret and sample-complexity bounds for reinforcement learning with myopic exploration.
no code implementations • NeurIPS 2021 • Satyen Kale, Ayush Sekhari, Karthik Sridharan
We show that there is an SCO problem such that GD with any step size and number of iterations can only learn at a suboptimal rate: at least $\widetilde{\Omega}(1/n^{5/12})$.
no code implementations • NeurIPS 2021 • Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan
In this work, we consider the more realistic setting of agnostic RL with rich observation spaces and a fixed class of policies $\Pi$ that may not contain any near-optimal policy.
no code implementations • NeurIPS 2020 • Kush Bhatia, Karthik Sridharan
In this setting, we study the problem of minimizing policy regret and provide non-constructive upper bounds on the minimax rate for the problem.
no code implementations • 24 Jun 2020 • Yossi Arjevani, Yair Carmon, John C. Duchi, Dylan J. Foster, Ayush Sekhari, Karthik Sridharan
We design an algorithm which finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\|\le \epsilon$) using $O(\epsilon^{-3})$ stochastic gradient and Hessian-vector products, matching guarantees that were previously available only under a stronger assumption of access to multiple queries with the same random seed.
no code implementations • NeurIPS 2020 • Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan
We study episodic reinforcement learning in Markov decision processes when the agent receives additional feedback per step in the form of several transition observations.
no code implementations • NeurIPS 2019 • Dylan J. Foster, Spencer Greenberg, Satyen Kale, Haipeng Luo, Mehryar Mohri, Karthik Sridharan
Our main result is a generalization bound for data-dependent hypothesis sets expressed in terms of a notion of hypothesis set stability and a notion of Rademacher complexity for data-dependent hypothesis sets that we introduce.
no code implementations • 28 Feb 2019 • Jayadev Acharya, Christopher De Sa, Dylan J. Foster, Karthik Sridharan
In distributed statistical learning, $N$ samples are split across $m$ machines and a learner wishes to use minimal communication to learn as well as if the examples were on a single machine.
no code implementations • 13 Feb 2019 • Dylan J. Foster, Ayush Sekhari, Ohad Shamir, Nathan Srebro, Karthik Sridharan, Blake Woodworth
Notably, we show that in the global oracle/statistical learning model, only logarithmic dependence on smoothness is required to find a near-stationary point, whereas polynomial dependence on smoothness is necessary in the local stochastic oracle model.
no code implementations • NeurIPS 2018 • Dylan J. Foster, Ayush Sekhari, Karthik Sridharan
We investigate 1) the rate at which refined properties of the empirical risk---in particular, gradients---converge to their population counterparts in standard non-convex learning tasks, and 2) the consequences of this convergence for optimization.
1 code implementation • 11 Sep 2018 • Andrew Cotter, Heinrich Jiang, Serena Wang, Taman Narayan, Maya Gupta, Seungil You, Karthik Sridharan
This new formulation leads to an algorithm that produces a stochastic classifier by playing a two-player non-zero-sum game, solving for what we call a semi-coarse correlated equilibrium, which in turn corresponds to an approximately optimal and feasible solution to the constrained optimization problem.
1 code implementation • 29 Jun 2018 • Andrew Cotter, Maya Gupta, Heinrich Jiang, Nathan Srebro, Karthik Sridharan, Serena Wang, Blake Woodworth, Seungil You
Classifiers can be trained with data-dependent constraints to satisfy fairness goals, reduce churn, achieve a targeted false positive rate, or other policy goals.
1 code implementation • 17 Apr 2018 • Andrew Cotter, Heinrich Jiang, Karthik Sridharan
For both the proxy-Lagrangian and Lagrangian formulations, however, we prove that this classifier, instead of having unbounded size, can be taken to be a distribution over no more than $m+1$ models (where $m$ is the number of constraints).
no code implementations • 25 Mar 2018 • Dylan J. Foster, Satyen Kale, Haipeng Luo, Mehryar Mohri, Karthik Sridharan
Starting with the simple observation that the logistic loss is $1$-mixable, we design a new efficient improper learning algorithm for online logistic regression that circumvents the aforementioned lower bound with a regret bound exhibiting a doubly-exponential improvement in dependence on the predictor norm.
no code implementations • 20 Mar 2018 • Dylan J. Foster, Alexander Rakhlin, Karthik Sridharan
We uncover a fairly general principle in online learning: If regret can be (approximately) expressed as a function of certain "sufficient statistics" for the data sequence, then there exists a special Burkholder function that 1) can be used algorithmically to achieve the regret bound and 2) only depends on these sufficient statistics, not the entire data sequence, so that the online strategy is only required to keep the sufficient statistics in memory.
no code implementations • NeurIPS 2017 • Dylan J. Foster, Satyen Kale, Mehryar Mohri, Karthik Sridharan
We introduce an efficient algorithmic framework for model selection in online learning, also known as parameter-free online learning.
no code implementations • 9 Nov 2017 • Thodoris Lykouris, Karthik Sridharan, Eva Tardos
We develop a black-box approach for such problems where the learner observes as feedback only losses of a subset of the actions that includes the selected action.
no code implementations • 13 Apr 2017 • Dylan J. Foster, Alexander Rakhlin, Karthik Sridharan
To develop a general theory of when this type of adaptive regret bound is achievable we establish a connection to the theory of decoupling inequalities for martingales in Banach spaces.
no code implementations • 8 Mar 2017 • Dylan J. Foster, Daniel Reichman, Karthik Sridharan
For two-dimensional grids, our results improve over Globerson et al. (2015) by obtaining optimal recovery in the constant-height regime.
no code implementations • 31 Aug 2016 • Alexander Rakhlin, Karthik Sridharan
We revisit the elegant observation of T. Cover '65 which, perhaps, is not as well-known to the broader community as it should be.
no code implementations • NeurIPS 2016 • Dylan J. Foster, Zhiyuan Li, Thodoris Lykouris, Karthik Sridharan, Eva Tardos
We show that learning algorithms satisfying a $\textit{low approximate regret}$ property experience fast convergence to approximate optimality in a large class of repeated games.
no code implementations • 6 Feb 2016 • Alexander Rakhlin, Karthik Sridharan
We present efficient algorithms for the problem of contextual bandits with i.i.d.
no code implementations • NeurIPS 2016 • Zeyuan Allen-Zhu, Yang Yuan, Karthik Sridharan
The amount of data available in the world is growing faster than our ability to deal with it.
no code implementations • 17 Dec 2015 • Matt J. Kusner, Yu Sun, Karthik Sridharan, Kilian Q. Weinberger
Causal inference has the potential to have a significant impact on medical research, on the prevention and control of diseases, and on identifying factors that drive economic change, to name just a few areas.
no code implementations • 13 Oct 2015 • Alexander Rakhlin, Karthik Sridharan
We study an equivalence of (i) deterministic pathwise statements appearing in the online learning literature (termed "regret bounds"), (ii) high-probability tail bounds for the supremum of a collection of martingales (of a specific form arising from uniform laws of large numbers for martingales), and (iii) in-expectation bounds for the supremum.
no code implementations • NeurIPS 2015 • Dylan J. Foster, Alexander Rakhlin, Karthik Sridharan
We propose a general framework for studying adaptive regret bounds in the online learning framework, including model selection bounds and data-dependent bounds.
no code implementations • 4 Mar 2015 • Alexander Rakhlin, Karthik Sridharan
We study online prediction where regret of the algorithm is measured against a benchmark defined via evolving constraints.
no code implementations • 21 Feb 2015 • Tengyuan Liang, Alexander Rakhlin, Karthik Sridharan
We consider regression with square loss and general classes of functions without the boundedness assumption.
no code implementations • 29 Jan 2015 • Alexander Rakhlin, Karthik Sridharan
We analyze the problem of sequential probability assignment for binary outcomes with side information and logarithmic loss, where regret (or redundancy) is measured with respect to a (possibly infinite) class of experts.
no code implementations • 26 Jan 2015 • Alexander Rakhlin, Karthik Sridharan
This paper establishes minimax rates for online regression with arbitrary classes of functions and general losses.
no code implementations • 26 Jan 2015 • Ali Jadbabaie, Alexander Rakhlin, Shahin Shahrampour, Karthik Sridharan
Recent literature on online learning has focused on developing adaptive algorithms that take advantage of a regularity of the sequence of observations, yet retain worst-case performance guarantees.
no code implementations • 11 Feb 2014 • Alexander Rakhlin, Karthik Sridharan
The optimal rates are shown to exhibit a phase transition analogous to the i.i.d./statistical learning case, studied in (Rakhlin, Sridharan, Tsybakov 2013).
no code implementations • NeurIPS 2013 • Alexander Rakhlin, Karthik Sridharan
We provide several applications of Optimistic Mirror Descent, an online learning algorithm based on the idea of predictable sequences.
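The predictable-sequences idea can be illustrated with a minimal Euclidean sketch (the simplest instance of Optimistic Mirror Descent, sometimes called optimistic gradient descent). The function names and the projection domain below are illustrative choices, not the paper's notation: the learner plays using a hint $M_t$ for the upcoming gradient, then corrects with the observed gradient $g_t$.

```python
import numpy as np

def project_ball(x, radius=1.0):
    """Euclidean projection onto the ball of the given radius."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def optimistic_ogd(gradients, hints, eta=0.1, radius=1.0):
    """Optimistic online gradient descent (Euclidean Optimistic Mirror Descent).

    gradients: observed loss gradients g_t
    hints:     predictable guesses M_t for g_t (e.g. M_t = g_{t-1})
    Returns the sequence of played points x_t.
    """
    x_half = np.zeros_like(gradients[0])  # secondary ("leader") iterate
    played = []
    for g, m in zip(gradients, hints):
        # play a point that already anticipates the hinted gradient
        x = project_ball(x_half - eta * m, radius)
        played.append(x)
        # update the secondary iterate with the true gradient
        x_half = project_ball(x_half - eta * g, radius)
    return played
```

When the hints are accurate, regret scales with $\sum_t \|g_t - M_t\|^2$ rather than $\sum_t \|g_t\|^2$, which is how variance and path-length bounds arise as special cases.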
no code implementations • 6 Aug 2013 • Alexander Rakhlin, Karthik Sridharan, Alexandre B. Tsybakov
Furthermore, for $p\in(0, 2)$, the excess risk rate matches the behavior of the minimax risk of function estimation in regression problems under the well-specified model.
no code implementations • NeurIPS 2012 • Sasha Rakhlin, Ohad Shamir, Karthik Sridharan
We show a principled way of deriving online learning algorithms from a minimax analysis.
no code implementations • 18 Aug 2012 • Alexander Rakhlin, Karthik Sridharan
Variance and path-length bounds can be seen as particular examples of online learning with simple predictable sequences.
no code implementations • NeurIPS 2011 • Andrew Cotter, Ohad Shamir, Nati Srebro, Karthik Sridharan
Mini-batch algorithms have recently received significant attention as a way to speed-up stochastic convex optimization problems.
no code implementations • NeurIPS 2011 • Alexander Rakhlin, Karthik Sridharan, Ambuj Tewari
We define the minimax value of a game where the adversary is restricted in his moves, capturing stochastic and non-stochastic assumptions on data.
no code implementations • NeurIPS 2011 • Nati Srebro, Karthik Sridharan, Ambuj Tewari
We show that for a general class of convex online learning problems, Mirror Descent can always achieve a (nearly) optimal regret guarantee.
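As a concrete instance of Mirror Descent, here is a short sketch of the update with the entropy mirror map over the probability simplex (exponentiated gradient / multiplicative weights); the function name and step size are illustrative, not from the paper:

```python
import numpy as np

def exponentiated_gradient(loss_grads, eta=0.1):
    """Mirror Descent over the probability simplex with the entropy mirror map.

    loss_grads: sequence of loss gradients (one vector per round)
    Returns the sequence of weight vectors played.
    """
    d = len(loss_grads[0])
    w = np.full(d, 1.0 / d)  # uniform starting point
    iterates = []
    for g in loss_grads:
        iterates.append(w.copy())
        w = w * np.exp(-eta * np.asarray(g))  # mirror (multiplicative) step
        w /= w.sum()                          # renormalize onto the simplex
    return iterates
```

With the entropy regularizer, the projection step reduces to the normalization above; this is the standard reason Mirror Descent adapts naturally to the geometry of the decision set.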
no code implementations • NeurIPS 2010 • Alexander Rakhlin, Karthik Sridharan, Ambuj Tewari
We develop a theory of online learning by defining several complexity measures.
no code implementations • NeurIPS 2010 • Nathan Srebro, Karthik Sridharan, Ambuj Tewari
We establish an excess risk bound of $O(H R_n^2 + \sqrt{H L^*} R_n)$ for ERM with an $H$-smooth loss function and a hypothesis class with Rademacher complexity $R_n$, where $L^*$ is the best risk achievable by the hypothesis class.
no code implementations • 6 Jun 2010 • Alexander Rakhlin, Karthik Sridharan, Ambuj Tewari
We consider the problem of sequential prediction and provide tools to study the minimax value of the associated game.
no code implementations • 31 Oct 2009 • Sham M. Kakade, Ohad Shamir, Karthik Sridharan, Ambuj Tewari
The versatility of exponential families, along with their attendant convexity properties, make them a popular and effective statistical model.
no code implementations • NeurIPS 2008 • Sham M. Kakade, Karthik Sridharan, Ambuj Tewari
We provide sharp bounds for Rademacher and Gaussian complexities of (constrained) linear classes.
no code implementations • NeurIPS 2008 • Karthik Sridharan, Shai Shalev-Shwartz, Nathan Srebro
We show that the empirical minimizer of a stochastic strongly convex objective, where the stochastic component is linear, converges to the population minimizer with rate $O(1/n)$.
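The $O(1/n)$ rate can be checked numerically on a toy instance of this setting (this simulation is my illustration, not the paper's construction): take $F(w) = \frac{\lambda}{2}\|w\|^2 + \mathbb{E}_z\langle z, w\rangle$, so the empirical minimizer is $\hat{w} = -\bar{z}/\lambda$, the population minimizer is $w^* = -\mu/\lambda$, and the excess risk is $\|\bar{z}-\mu\|^2/(2\lambda)$.

```python
import numpy as np

def excess_risk(n, lam=1.0, dim=5, trials=200, rng=None):
    """Average excess risk of the empirical minimizer of
    F(w) = (lam/2)||w||^2 + E_z <z, w> with z ~ N(mu, I).

    The excess risk equals ||z_bar - mu||^2 / (2 lam), whose
    expectation is dim / (2 lam n), i.e. O(1/n).
    """
    rng = rng or np.random.default_rng(0)
    mu = np.ones(dim)
    risks = []
    for _ in range(trials):
        z = rng.normal(mu, 1.0, size=(n, dim))
        z_bar = z.mean(axis=0)  # empirical mean of the linear component
        risks.append(np.sum((z_bar - mu) ** 2) / (2 * lam))
    return float(np.mean(risks))
```

Growing $n$ by a factor of 10 should shrink the average excess risk by roughly the same factor, matching the $O(1/n)$ guarantee.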