Search Results for author: Alexander Rakhlin

Found 74 papers, 6 papers with code

The Power of Resets in Online Reinforcement Learning

no code implementations23 Apr 2024 Zakaria Mhammedi, Dylan J. Foster, Alexander Rakhlin

We use local simulator access to unlock new statistical guarantees that were previously out of reach: we show that MDPs with low coverability (Xie et al., 2023) -- a general structural condition that subsumes Block MDPs and Low-Rank MDPs -- can be learned in a sample-efficient fashion with only $Q^{\star}$-realizability (realizability of the optimal state-action value function); existing online RL algorithms require significantly stronger representation conditions.

reinforcement-learning

Online Estimation via Offline Estimation: An Information-Theoretic Framework

no code implementations15 Apr 2024 Dylan J. Foster, Yanjun Han, Jian Qian, Alexander Rakhlin

Our main results settle the statistical and computational complexity of online estimation in this framework.

Decision Making Density Estimation

Offline Reinforcement Learning: Role of State Aggregation and Trajectory Data

no code implementations25 Mar 2024 Zeyu Jia, Alexander Rakhlin, Ayush Sekhari, Chen-Yu Wei

We revisit the problem of offline reinforcement learning with value function realizability but without Bellman completeness.

reinforcement-learning

On the Performance of Empirical Risk Minimization with Smoothed Data

no code implementations22 Feb 2024 Adam Block, Alexander Rakhlin, Abhishek Shetty

In order to circumvent statistical and computational hardness results in sequential decision-making, recent work has considered smoothed online learning, where the distribution of data at each time is assumed to have bounded likelihood ratio with respect to a base measure when conditioned on the history.

Decision Making

Foundations of Reinforcement Learning and Interactive Decision Making

no code implementations27 Dec 2023 Dylan J. Foster, Alexander Rakhlin

These lecture notes give a statistical perspective on the foundations of reinforcement learning and interactive decision making.

Decision Making Multi-Armed Bandits +1

Efficient Model-Free Exploration in Low-Rank MDPs

no code implementations NeurIPS 2023 Zakaria Mhammedi, Adam Block, Dylan J. Foster, Alexander Rakhlin

A major challenge in reinforcement learning is to develop practical, sample-efficient algorithms for exploration in high-dimensional domains where generalization and function approximation are required.

Representation Learning

Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL

1 code implementation12 Apr 2023 Zakaria Mhammedi, Dylan J. Foster, Alexander Rakhlin

We address these issues by providing the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level, with minimal statistical assumptions.

Representation Learning

Tight Bounds for $γ$-Regret via the Decision-Estimation Coefficient

no code implementations6 Mar 2023 Margalit Glasgow, Alexander Rakhlin

Our lower bound shows that the $\gamma$-DEC is a fundamental limit for any model class $\mathcal{F}$: for any algorithm, there exists some $f \in \mathcal{F}$ for which the $\gamma$-regret of that algorithm scales (nearly) with the $\gamma$-DEC of $\mathcal{F}$.

Oracle-Efficient Smoothed Online Learning for Piecewise Continuous Decision Making

no code implementations10 Feb 2023 Adam Block, Alexander Rakhlin, Max Simchowitz

Smoothed online learning has emerged as a popular framework to mitigate the substantial loss in statistical and computational complexity that arises when one moves from classical to adversarial learning.

Decision Making Econometrics

On the Complexity of Adversarial Decision Making

no code implementations27 Jun 2022 Dylan J. Foster, Alexander Rakhlin, Ayush Sekhari, Karthik Sridharan

A central problem in online learning and decision making -- from bandits to reinforcement learning -- is to understand what modeling assumptions lead to sample-efficient learning guarantees.

Decision Making reinforcement-learning +1

Damped Online Newton Step for Portfolio Selection

no code implementations15 Feb 2022 Zakaria Mhammedi, Alexander Rakhlin

In this paper, we build on the recent work of Luo et al. (2018) and present the first practical online portfolio selection algorithm with logarithmic regret whose per-round time and space complexities depend only logarithmically on the horizon.

Smoothed Online Learning is as Easy as Statistical Learning

no code implementations9 Feb 2022 Adam Block, Yuval Dagan, Noah Golowich, Alexander Rakhlin

We then prove a lower bound on the oracle complexity of any proper learning algorithm, which matches the oracle-efficient upper bounds up to a polynomial factor, thus demonstrating the existence of a statistical-computational gap in smooth online learning.

Learning Theory Multi-Armed Bandits

The Statistical Complexity of Interactive Decision Making

no code implementations27 Dec 2021 Dylan J. Foster, Sham M. Kakade, Jian Qian, Alexander Rakhlin

The main result of this work provides a complexity measure, the Decision-Estimation Coefficient, that is proven to be both necessary and sufficient for sample-efficient interactive learning.

Decision Making reinforcement-learning +1

On Submodular Contextual Bandits

no code implementations3 Dec 2021 Dean P. Foster, Alexander Rakhlin

We consider the problem of contextual bandits where actions are subsets of a ground set and mean rewards are modeled by an unknown monotone submodular function that belongs to a class $\mathcal{F}$.

Multi-Armed Bandits

Intrinsic Dimension Estimation Using Wasserstein Distances

no code implementations8 Jun 2021 Adam Block, Zeyu Jia, Yury Polyanskiy, Alexander Rakhlin

It has long been thought that high-dimensional data encountered in many practical machine learning tasks have low-dimensional structure, i.e., the manifold hypothesis holds.

BIG-bench Machine Learning

Deep learning: a statistical viewpoint

no code implementations16 Mar 2021 Peter L. Bartlett, Andrea Montanari, Alexander Rakhlin

We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting.

On the Minimal Error of Empirical Risk Minimization

no code implementations24 Feb 2021 Gil Kur, Alexander Rakhlin

We study the minimal error of the Empirical Risk Minimization (ERM) procedure in the task of regression, both in the random and the fixed design settings.

regression

Top-$k$ eXtreme Contextual Bandits with Arm Hierarchy

1 code implementation15 Feb 2021 Rajat Sen, Alexander Rakhlin, Lexing Ying, Rahul Kidambi, Dean Foster, Daniel Hill, Inderjit Dhillon

We show that our algorithm has a regret guarantee of $O(k\sqrt{(A-k+1)T \log (|\mathcal{F}|T)})$, where $A$ is the total number of arms and $\mathcal{F}$ is the class containing the regression function, while only requiring $\tilde{O}(A)$ computation per time step.

Computational Efficiency Extreme Multi-Label Classification +2

Learning the Linear Quadratic Regulator from Nonlinear Observations

no code implementations NeurIPS 2020 Zakaria Mhammedi, Dylan J. Foster, Max Simchowitz, Dipendra Misra, Wen Sun, Akshay Krishnamurthy, Alexander Rakhlin, John Langford

We introduce a new algorithm, RichID, which learns a near-optimal policy for the RichLQR with sample complexity scaling only with the dimension of the latent state space and the capacity of the decoder function class.

Continuous Control

Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective

no code implementations7 Oct 2020 Dylan J. Foster, Alexander Rakhlin, David Simchi-Levi, Yunzong Xu

In the classical multi-armed bandit problem, instance-dependent algorithms attain improved performance on "easy" problems with a gap between the best and second-best arm.

Active Learning Multi-Armed Bandits +2

On Suboptimality of Least Squares with Application to Estimation of Convex Bodies

no code implementations7 Jun 2020 Gil Kur, Alexander Rakhlin, Adityanand Guntuboyina

We develop a technique for establishing lower bounds on the sample complexity of Least Squares (or, Empirical Risk Minimization) for large classes of functions.

Learning nonlinear dynamical systems from a single trajectory

no code implementations L4DC 2020 Dylan J. Foster, Alexander Rakhlin, Tuhin Sarkar

We introduce algorithms for learning nonlinear dynamical systems of the form $x_{t+1}=\sigma(\Theta^{\star}x_t)+\varepsilon_t$, where $\Theta^{\star}$ is a weight matrix, $\sigma$ is a nonlinear link function, and $\varepsilon_t$ is a mean-zero noise process.
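The system class in this abstract is straightforward to simulate; a minimal sketch, where the link function $\sigma$, the state dimension, and the noise scale are illustrative choices rather than values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

d, T = 3, 100                              # state dimension and horizon (illustrative)
Theta = 0.5 * rng.standard_normal((d, d))  # unknown weight matrix Theta*
sigma = np.tanh                            # an assumed 1-Lipschitz link function

# Generate a single trajectory x_1, ..., x_T with mean-zero Gaussian noise,
# following the recursion x_{t+1} = sigma(Theta* x_t) + eps_t.
x = np.zeros((T, d))
for t in range(T - 1):
    x[t + 1] = sigma(Theta @ x[t]) + 0.1 * rng.standard_normal(d)
```

A learning algorithm of the kind the paper studies would then recover $\Theta^{\star}$ from this single trajectory.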

Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles

no code implementations ICML 2020 Dylan J. Foster, Alexander Rakhlin

We characterize the minimax rates for contextual bandits with general, potentially nonparametric function classes, and show that our algorithm is minimax optimal whenever the oracle obtains the optimal rate for regression.

Multi-Armed Bandits regression

Generative Modeling with Denoising Auto-Encoders and Langevin Sampling

no code implementations31 Jan 2020 Adam Block, Youssef Mroueh, Alexander Rakhlin

We show that both DAE and DSM provide estimates of the score of the Gaussian smoothed population density, allowing us to apply the machinery of Empirical Processes.

Denoising
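The Langevin sampling referred to in the title can be sketched once a score function is available; here the exact score of a standard Gaussian stands in for the DAE/DSM estimate, and the step size and iteration count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def score(x):
    # Score grad log p(x) of a standard Gaussian is -x. In the paper's
    # setting this would be an estimate produced by a DAE or DSM; the
    # exact score is used here purely for illustration.
    return -x

eta, n_steps = 0.1, 2000
x = 5.0 * np.ones(2)               # start far from the mode
for _ in range(n_steps):
    # Unadjusted Langevin step: drift along the score plus Gaussian noise.
    x = x + 0.5 * eta * score(x) + np.sqrt(eta) * rng.standard_normal(2)
```

After many steps the iterates are approximately distributed according to the target density.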

$\ell_{\infty}$ Vector Contraction for Rademacher Complexity

no code implementations15 Nov 2019 Dylan J. Foster, Alexander Rakhlin

We show that the Rademacher complexity of any $\mathbb{R}^{K}$-valued function class composed with an $\ell_{\infty}$-Lipschitz function is bounded by the maximum Rademacher complexity of the restriction of the function class along each coordinate, times a factor of $\tilde{O}(\sqrt{K})$.

On the Multiple Descent of Minimum-Norm Interpolants and Restricted Lower Isometry of Kernels

no code implementations27 Aug 2019 Tengyuan Liang, Alexander Rakhlin, Xiyu Zhai

We study the risk of minimum-norm interpolants of data in Reproducing Kernel Hilbert Spaces.

Using effective dimension to analyze feature transformations in deep neural networks

no code implementations ICML Workshop Deep_Phenomen 2019 Kavya Ravichandran, Ajay Jain, Alexander Rakhlin

In a typical deep learning approach to a computer vision task, Convolutional Neural Networks (CNNs) are used to extract features at varying levels of abstraction from an image and compress a high-dimensional input into a lower-dimensional decision space through a series of transformations.

Optimality of Maximum Likelihood for Log-Concave Density Estimation and Bounded Convex Regression

no code implementations13 Mar 2019 Gil Kur, Yuval Dagan, Alexander Rakhlin

In this paper, we study two problems: (1) estimation of a $d$-dimensional log-concave distribution and (2) bounded multivariate convex regression with random design with an underlying log-concave density or a compactly supported distribution with a continuous density.

Density Estimation regression

Consistency of Interpolation with Laplace Kernels is a High-Dimensional Phenomenon

no code implementations28 Dec 2018 Alexander Rakhlin, Xiyu Zhai

We show that minimum-norm interpolation in the Reproducing Kernel Hilbert Space corresponding to the Laplace kernel is not consistent if input dimension is constant.

Vocal Bursts Intensity Prediction

Just Interpolate: Kernel "Ridgeless" Regression Can Generalize

no code implementations1 Aug 2018 Tengyuan Liang, Alexander Rakhlin

In the absence of explicit regularization, Kernel "Ridgeless" Regression with nonlinear kernels has the potential to fit the training data perfectly.

regression
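The "ridgeless" interpolant the abstract refers to is the minimum-norm kernel solution $\hat f(x) = k(x, X)\,K^{-1} y$, obtained by taking the ridge penalty to zero. A minimal sketch with an illustrative Gaussian (RBF) kernel and toy data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (illustrative, not from the paper).
X = rng.standard_normal((20, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(20)

def rbf(A, B, gamma=1.0):
    # Gaussian kernel k(a, b) = exp(-gamma * ||a - b||^2).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# "Ridgeless" regression: no regularization, so the solution interpolates.
K = rbf(X, X)
alpha = np.linalg.solve(K, y)
train_preds = K @ alpha            # equals y up to numerical round-off
```

The point of the paper is that, despite fitting the (noisy) training data perfectly, such interpolants can still generalize.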

Does data interpolation contradict statistical optimality?

no code implementations25 Jun 2018 Mikhail Belkin, Alexander Rakhlin, Alexandre B. Tsybakov

We show that learning methods interpolating the training data can achieve optimal rates for the problems of nonparametric regression and prediction with square loss.

regression

Angiodysplasia Detection and Localization Using Deep Convolutional Neural Networks

1 code implementation21 Apr 2018 Alexey Shvets, Vladimir Iglovikov, Alexander Rakhlin, Alexandr A. Kalinin

Accurate detection and localization of angiodysplasia lesions is an important problem in early-stage diagnostics of gastrointestinal bleeding and anemia.

Online Learning: Sufficient Statistics and the Burkholder Method

no code implementations20 Mar 2018 Dylan J. Foster, Alexander Rakhlin, Karthik Sridharan

We uncover a fairly general principle in online learning: If regret can be (approximately) expressed as a function of certain "sufficient statistics" for the data sequence, then there exists a special Burkholder function that 1) can be used algorithmically to achieve the regret bound and 2) only depends on these sufficient statistics, not the entire data sequence, so that the online strategy is only required to keep the sufficient statistics in memory.

Deep Convolutional Neural Networks for Breast Cancer Histology Image Analysis

3 code implementations2 Feb 2018 Alexander Rakhlin, Alexey Shvets, Vladimir Iglovikov, Alexandr A. Kalinin

In this work, we develop a computational approach based on deep convolutional neural networks for breast cancer histology image classification.

Breast Cancer Detection Classification +4

Theory of Deep Learning IIb: Optimization Properties of SGD

no code implementations7 Jan 2018 Chiyuan Zhang, Qianli Liao, Alexander Rakhlin, Brando Miranda, Noah Golowich, Tomaso Poggio

In Theory IIb we characterize with a mix of theory and experiments the optimization of deep convolutional networks by Stochastic Gradient Descent.

Size-Independent Sample Complexity of Neural Networks

no code implementations18 Dec 2017 Noah Golowich, Alexander Rakhlin, Ohad Shamir

We study the sample complexity of learning neural networks, by providing new bounds on their Rademacher complexity assuming norm constraints on the parameter matrix of each layer.

Pediatric Bone Age Assessment Using Deep Convolutional Neural Networks

no code implementations13 Dec 2017 Vladimir Iglovikov, Alexander Rakhlin, Alexandr Kalinin, Alexey Shvets

Skeletal bone age assessment is a common clinical practice to diagnose endocrine and metabolic disorders in child development.

Fisher-Rao Metric, Geometry, and Complexity of Neural Networks

1 code implementation5 Nov 2017 Tengyuan Liang, Tomaso Poggio, Alexander Rakhlin, James Stokes

We study the relationship between geometry and capacity measures for deep neural networks from an invariance viewpoint.

LEMMA

Weighted Message Passing and Minimum Energy Flow for Heterogeneous Stochastic Block Models with Side Information

no code implementations12 Sep 2017 T. Tony Cai, Tengyuan Liang, Alexander Rakhlin

We develop an optimally weighted message passing algorithm to reconstruct labels for SBM based on the minimum energy flow and the eigenvectors of a certain Markov transition matrix.

Community Detection

ZigZag: A new approach to adaptive online learning

no code implementations13 Apr 2017 Dylan J. Foster, Alexander Rakhlin, Karthik Sridharan

To develop a general theory of when this type of adaptive regret bound is achievable we establish a connection to the theory of decoupling inequalities for martingales in Banach spaces.

Non-convex learning via Stochastic Gradient Langevin Dynamics: a nonasymptotic analysis

no code implementations13 Feb 2017 Maxim Raginsky, Alexander Rakhlin, Matus Telgarsky

Stochastic Gradient Langevin Dynamics (SGLD) is a popular variant of Stochastic Gradient Descent, where properly scaled isotropic Gaussian noise is added to an unbiased estimate of the gradient at each iteration.
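The SGLD update described here is gradient descent plus properly scaled isotropic Gaussian noise; a minimal sketch on a toy non-convex objective, with an assumed step size and inverse temperature (the objective, constants, and iteration count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def grad(theta):
    # Gradient of the double-well f(theta) = (theta^2 - 1)^2 / 4, an
    # illustrative stand-in for a non-convex empirical risk.
    return theta * (theta ** 2 - 1)

eta, beta = 1e-2, 20.0             # step size and inverse temperature (assumed)
theta = 3.0
for _ in range(5000):
    # SGLD step: gradient descent plus sqrt(2*eta/beta)-scaled Gaussian noise.
    noise = np.sqrt(2 * eta / beta) * rng.standard_normal()
    theta = theta - eta * grad(theta) + noise
```

With full gradients, as here, this is unadjusted Langevin dynamics; SGLD proper replaces `grad` with an unbiased stochastic estimate.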

A Tutorial on Online Supervised Learning with Applications to Node Classification in Social Networks

no code implementations31 Aug 2016 Alexander Rakhlin, Karthik Sridharan

We revisit an elegant observation of T. Cover (1965) which, perhaps, is not as well known to the broader community as it should be.

General Classification Node Classification

On Detection and Structural Reconstruction of Small-World Random Networks

no code implementations21 Apr 2016 T. Tony Cai, Tengyuan Liang, Alexander Rakhlin

In this paper, we study detection and fast reconstruction of the celebrated Watts-Strogatz (WS) small-world random graph model \citep{watts1998collective} which aims to describe real-world complex networks that exhibit both high clustering and short average length properties.

Clustering

Inference via Message Passing on Partially Labeled Stochastic Block Models

no code implementations22 Mar 2016 T. Tony Cai, Tengyuan Liang, Alexander Rakhlin

We study the community detection and recovery problem in partially-labeled stochastic block models (SBM).

Community Detection

Distributed Estimation of Dynamic Parameters: Regret Analysis

no code implementations2 Mar 2016 Shahin Shahrampour, Alexander Rakhlin, Ali Jadbabaie

To this end, we use a notion of dynamic regret which suits the online, non-stationary nature of the problem.

On Equivalence of Martingale Tail Bounds and Deterministic Regret Inequalities

no code implementations13 Oct 2015 Alexander Rakhlin, Karthik Sridharan

We study an equivalence of (i) deterministic pathwise statements appearing in the online learning literature (termed \emph{regret bounds}), (ii) high-probability tail bounds for the supremum of a collection of martingales (of a specific form arising from uniform laws of large numbers for martingales), and (iii) in-expectation bounds for the supremum.

Adaptive Online Learning

no code implementations NeurIPS 2015 Dylan J. Foster, Alexander Rakhlin, Karthik Sridharan

We propose a general framework for studying adaptive regret bounds in the online learning framework, including model selection bounds and data-dependent bounds.

Model Selection

Hierarchies of Relaxations for Online Prediction Problems with Evolving Constraints

no code implementations4 Mar 2015 Alexander Rakhlin, Karthik Sridharan

We study online prediction where regret of the algorithm is measured against a benchmark defined via evolving constraints.

Learning with Square Loss: Localization through Offset Rademacher Complexity

no code implementations21 Feb 2015 Tengyuan Liang, Alexander Rakhlin, Karthik Sridharan

We consider regression with square loss and general classes of functions without the boundedness assumption.

regression

Computational and Statistical Boundaries for Submatrix Localization in a Large Noisy Matrix

no code implementations6 Feb 2015 T. Tony Cai, Tengyuan Liang, Alexander Rakhlin

The second threshold, $\sf SNR_s$, captures the statistical boundary, below which no method can succeed with probability going to one in the minimax sense.

Computational Efficiency

Sequential Probability Assignment with Binary Alphabets and Large Classes of Experts

no code implementations29 Jan 2015 Alexander Rakhlin, Karthik Sridharan

We analyze the problem of sequential probability assignment for binary outcomes with side information and logarithmic loss, where regret -- or redundancy -- is measured with respect to a (possibly infinite) class of experts.

Escaping the Local Minima via Simulated Annealing: Optimization of Approximately Convex Functions

no code implementations28 Jan 2015 Alexandre Belloni, Tengyuan Liang, Hariharan Narayanan, Alexander Rakhlin

We consider the problem of optimizing an approximately convex function over a bounded convex set in $\mathbb{R}^n$ using only function evaluations.

Online Optimization: Competing with Dynamic Comparators

no code implementations26 Jan 2015 Ali Jadbabaie, Alexander Rakhlin, Shahin Shahrampour, Karthik Sridharan

Recent literature on online learning has focused on developing adaptive algorithms that take advantage of a regularity of the sequence of observations, yet retain worst-case performance guarantees.

Online Nonparametric Regression with General Loss Functions

no code implementations26 Jan 2015 Alexander Rakhlin, Karthik Sridharan

This paper establishes minimax rates for online regression with arbitrary classes of functions and general losses.

regression

Distributed Detection: Finite-time Analysis and Impact of Network Topology

no code implementations30 Sep 2014 Shahin Shahrampour, Alexander Rakhlin, Ali Jadbabaie

In contrast to the existing literature which focuses on asymptotic learning, we provide a finite-time analysis.

Geometric Inference for General High-Dimensional Linear Inverse Problems

no code implementations17 Apr 2014 T. Tony Cai, Tengyuan Liang, Alexander Rakhlin

This paper presents a unified geometric framework for the statistical analysis of a general ill-posed linear inverse model which includes as special cases noisy compressed sensing, sign vector recovery, trace regression, orthogonal matrix estimation, and noisy matrix completion.

Matrix Completion regression +2

On Zeroth-Order Stochastic Convex Optimization via Random Walks

no code implementations11 Feb 2014 Tengyuan Liang, Hariharan Narayanan, Alexander Rakhlin

The method is based on a random walk (the \emph{Ball Walk}) on the epigraph of the function.

Online Nonparametric Regression

no code implementations11 Feb 2014 Alexander Rakhlin, Karthik Sridharan

The optimal rates are shown to exhibit a phase transition analogous to the i.i.d./statistical learning case, studied in (Rakhlin, Sridharan, Tsybakov 2013).

regression

Optimization, Learning, and Games with Predictable Sequences

no code implementations NeurIPS 2013 Alexander Rakhlin, Karthik Sridharan

We provide several applications of Optimistic Mirror Descent, an online learning algorithm based on the idea of predictable sequences.
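Optimistic Mirror Descent plays each round using a guess of the next gradient before updating with the observed one; a minimal sketch with the Euclidean mirror map, using the previous gradient as the predictable sequence (the objective, step size, and horizon are illustrative):

```python
import numpy as np

def optimistic_gd(grad, x0, eta=0.1, T=200):
    # Euclidean instance of Optimistic Mirror Descent. Each round:
    #   1) take an optimistic step using the gradient prediction M,
    #   2) observe the true gradient at the played point,
    #   3) update the intermediate iterate with the true gradient.
    y = np.asarray(x0, dtype=float)   # intermediate iterate
    M = np.zeros_like(y)              # gradient prediction
    for _ in range(T):
        x = y - eta * M               # optimistic (predicted) step
        g = grad(x)                   # observed gradient
        y = y - eta * g               # standard mirror-descent step
        M = g                         # predict: next gradient ~ current one
    return x

# Minimize f(x) = ||x - c||^2 / 2; slowly varying gradients make the
# predictions accurate, so the iterates converge quickly to c.
c = np.array([1.0, -2.0])
x_final = optimistic_gd(lambda x: x - c, x0=np.zeros(2))
```

When the gradient sequence is predictable, the regret of this scheme improves on the worst-case rate, which is what enables the paper's applications to games and optimization.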

Online Learning of Dynamic Parameters in Social Networks

no code implementations NeurIPS 2013 Shahin Shahrampour, Alexander Rakhlin, Ali Jadbabaie

Based on the decomposition of the global loss function, we introduce two update mechanisms, each of which generates an estimate of the true state.

Efficient Sampling from Time-Varying Log-Concave Distributions

no code implementations23 Sep 2013 Hariharan Narayanan, Alexander Rakhlin

Within the context of exponential families, the proposed method produces samples from a posterior distribution which is updated as data arrive in a streaming fashion.

Empirical entropy, minimax regret and minimax risk

no code implementations6 Aug 2013 Alexander Rakhlin, Karthik Sridharan, Alexandre B. Tsybakov

Furthermore, for $p\in(0, 2)$, the excess risk rate matches the behavior of the minimax risk of function estimation in regression problems under the well-specified model.

Math regression

Online Learning with Predictable Sequences

no code implementations18 Aug 2012 Alexander Rakhlin, Karthik Sridharan

Variance and path-length bounds can be seen as particular examples of online learning with simple predictable sequences.

Model Selection Time Series +1

Lower Bounds for Passive and Active Learning

no code implementations NeurIPS 2011 Maxim Raginsky, Alexander Rakhlin

For passive learning, our lower bounds match the upper bounds of Gine and Koltchinskii up to constants and generalize analogous results of Massart and Nedelec.

Active Learning

Online Learning: Stochastic, Constrained, and Smoothed Adversaries

no code implementations NeurIPS 2011 Alexander Rakhlin, Karthik Sridharan, Ambuj Tewari

We define the minimax value of a game where the adversary is restricted in his moves, capturing stochastic and non-stochastic assumptions on data.

Learning Theory

Stochastic convex optimization with bandit feedback

no code implementations NeurIPS 2011 Alekh Agarwal, Dean P. Foster, Daniel J. Hsu, Sham M. Kakade, Alexander Rakhlin

This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $X$ under a stochastic bandit feedback model.

Random Walk Approach to Regret Minimization

no code implementations NeurIPS 2010 Hariharan Narayanan, Alexander Rakhlin

We propose a computationally efficient random walk on a convex body which rapidly mixes to a time-varying Gibbs distribution.

Online Learning via Sequential Complexities

no code implementations6 Jun 2010 Alexander Rakhlin, Karthik Sridharan, Ambuj Tewari

We consider the problem of sequential prediction and provide tools to study the minimax value of the associated game.

Learning Theory
