no code implementations • 27 Feb 2024 • Saurabh Mishra, Anant Raj, Sharan Vaswani
For a linear prediction model, we reduce CILP to a convex feasibility problem, allowing the use of standard algorithms such as alternating projections.
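As a rough illustration of the feasibility reduction this entry mentions, here is a minimal sketch of alternating projections between two convex sets; the sets below (a halfspace and a box) are illustrative toys, not the CILP-specific constraint sets from the paper.

```python
import numpy as np

# Minimal sketch of alternating projections for convex feasibility:
# find a point in the intersection of two convex sets by projecting
# onto each in turn. The halfspace and box are toy stand-ins.

def project_halfspace(x, a, b):
    """Project x onto {z : a^T z <= b}."""
    viol = a @ x - b
    return x if viol <= 0 else x - (viol / (a @ a)) * a

def project_box(x, lo, hi):
    """Project x onto {z : lo <= z <= hi} (coordinate-wise clip)."""
    return np.clip(x, lo, hi)

x = np.array([5.0, 5.0])
a, b = np.array([1.0, 1.0]), 1.0
for _ in range(100):
    x = project_box(project_halfspace(x, a, b), lo=-1.0, hi=1.0)
print(x)  # a point satisfying both constraints
```

For this instance the iterates settle after a single pass, since the halfspace projection of the starting point already lies inside the box.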
no code implementations • 12 Jan 2024 • Anh Dang, Reza Babanezhad, Sharan Vaswani
In particular, for strongly-convex quadratics with condition number $\kappa$, we prove that SHB with the standard step-size and momentum parameters results in an $O\left(\exp(-\frac{T}{\sqrt{\kappa}}) + \sigma \right)$ convergence rate, where $T$ is the number of iterations and $\sigma^2$ is the variance in the stochastic gradients.
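The update analyzed here can be sketched on a toy quadratic. The step-size and momentum below are the textbook Polyak choices for condition number $\kappa$; `sigma` controls the gradient noise, and `sigma = 0` gives the deterministic special case run here.

```python
import numpy as np

# Toy sketch of (stochastic) heavy ball on a strongly-convex quadratic
# f(x) = 0.5 x^T A x, with additive gradient noise of scale sigma.

rng = np.random.default_rng(0)
A = np.diag([1.0, 10.0])                  # mu = 1, L = 10, kappa = 10
mu, L = 1.0, 10.0
kappa = L / mu
alpha = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
beta = ((np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)) ** 2

sigma = 0.0                               # set > 0 for stochastic gradients
x_prev = x = np.array([1.0, 1.0])
for _ in range(200):
    grad = A @ x + sigma * rng.standard_normal(2)
    # heavy-ball step: gradient step plus momentum on the last displacement
    x, x_prev = x - alpha * grad + beta * (x - x_prev), x
print(np.linalg.norm(x))                  # ~0 when sigma == 0
```

With `sigma > 0` the iterates instead settle in a noise-dominated neighbourhood of the minimizer, matching the $\sigma$ term in the rate above.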
1 code implementation • NeurIPS 2023 • Sharan Vaswani, Amirreza Kazemi, Reza Babanezhad, Nicolas Le Roux
Instantiating the generic algorithm results in an actor that involves maximizing a sequence of surrogate functions (similar to TRPO, PPO) and a critic that involves minimizing a closely connected objective.
1 code implementation • 6 Feb 2023 • Jonathan Wilder Lavington, Sharan Vaswani, Reza Babanezhad, Mark Schmidt, Nicolas Le Roux
Our target optimization framework uses the (expensive) gradient computation to construct surrogate functions in a \emph{target space} (e.g., the logits output by a linear model for classification) that can be minimized efficiently.
1 code implementation • 29 Jul 2022 • Jonathan Wilder Lavington, Sharan Vaswani, Mark Schmidt
Specifically, if the class of policies is sufficiently expressive to contain the expert policy, we prove that DAGGER achieves constant regret.
no code implementations • 13 Jun 2022 • Sharan Vaswani, Lin F. Yang, Csaba Szepesvári
In particular, we design a model-based algorithm that addresses two settings: (i) relaxed feasibility, where small constraint violations are allowed, and (ii) strict feasibility, where the output policy is required to satisfy the constraint.
1 code implementation • 11 Apr 2022 • Arushi Jain, Sharan Vaswani, Reza Babanezhad, Csaba Szepesvari, Doina Precup
We propose a generic primal-dual framework that allows us to bound the reward sub-optimality and constraint violation for arbitrary algorithms in terms of their primal and dual regret on online linear optimization problems.
no code implementations • 21 Oct 2021 • Sharan Vaswani, Benjamin Dubois-Taine, Reza Babanezhad
To be adaptive to the smoothness, we use a stochastic line-search (SLS) and show (via upper and lower bounds) that SGD with SLS converges at the desired rate, but only to a neighbourhood of the solution.
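The SLS idea can be sketched on an interpolated least-squares problem: backtrack on the *sampled* loss until an Armijo condition holds for that same sample. The constants (`c`, `eta_max`, the halving factor) are illustrative choices, not the paper's.

```python
import numpy as np

# Toy sketch of SGD with a stochastic Armijo line-search (SLS).

rng = np.random.default_rng(0)
n, d = 50, 5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true                            # interpolation: zero noise

def loss_i(w, i):
    r = X[i] @ w - y[i]
    return 0.5 * r * r

def grad_i(w, i):
    return (X[i] @ w - y[i]) * X[i]

w = np.zeros(d)
c, eta_max = 0.5, 10.0
for _ in range(1000):
    i = rng.integers(n)
    g = grad_i(w, i)
    if g @ g == 0.0:
        continue                          # sample already fit exactly
    eta = eta_max
    # backtrack until the Armijo condition holds on sample i
    while loss_i(w - eta * g, i) > loss_i(w, i) - c * eta * (g @ g):
        eta *= 0.5
    w = w - eta * g
print(np.linalg.norm(w - w_true))
```

Because the data are interpolated (zero noise), the iterates converge to the true solution rather than just a neighbourhood.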
1 code implementation • 12 Aug 2021 • Sharan Vaswani, Olivier Bachem, Simone Totaro, Robert Mueller, Shivam Garg, Matthieu Geist, Marlos C. Machado, Pablo Samuel Castro, Nicolas Le Roux
Common policy gradient methods rely on the maximization of a sequence of surrogate functions.
no code implementations • 18 Feb 2021 • Benjamin Dubois-Taine, Sharan Vaswani, Reza Babanezhad, Mark Schmidt, Simon Lacoste-Julien
Variance reduction (VR) methods for finite-sum minimization typically require the knowledge of problem-dependent constants that are often unknown and difficult to estimate.
no code implementations • 28 Sep 2020 • Sharan Vaswani, Issam H. Laradji, Frederik Kunstner, Si Yi Meng, Mark Schmidt, Simon Lacoste-Julien
Under an interpolation assumption, we prove that AMSGrad with a constant step-size and momentum can converge to the minimizer at the faster $O(1/T)$ rate for smooth, convex functions.
no code implementations • 11 Jun 2020 • Sharan Vaswani, Reza Babanezhad, Jose Gallego, Aaron Mishkin, Simon Lacoste-Julien, Nicolas Le Roux
For under-parameterized linear classification, we prove that for any linear classifier separating the data, there exists a family of quadratic norms $\|\cdot\|_P$ such that the classifier's direction is the same as that of the maximum $P$-margin solution.
1 code implementation • 11 Jun 2020 • Sharan Vaswani, Issam Laradji, Frederik Kunstner, Si Yi Meng, Mark Schmidt, Simon Lacoste-Julien
In this setting, we prove that AMSGrad with constant step-size and momentum converges to the minimizer at a faster $O(1/T)$ rate.
1 code implementation • 24 Feb 2020 • Nicolas Loizou, Sharan Vaswani, Issam Laradji, Simon Lacoste-Julien
Consequently, the proposed stochastic Polyak step-size (SPS) is an attractive choice for setting the learning rate for stochastic gradient descent (SGD).
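The SPS rule sets $\eta_k = \frac{f_i(w_k) - f_i^*}{c\,\|\nabla f_i(w_k)\|^2}$; the sketch below uses $f_i^* = 0$, which holds for this interpolated least-squares toy. The constant `c` and the cap `eta_max` are illustrative.

```python
import numpy as np

# Toy sketch of SGD with the stochastic Polyak step-size (SPS).

rng = np.random.default_rng(1)
n, d = 40, 4
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true                            # interpolation: each f_i^* = 0

w = np.zeros(d)
c, eta_max = 0.5, 10.0
for _ in range(400):
    i = rng.integers(n)
    r = X[i] @ w - y[i]
    g = r * X[i]                          # gradient of 0.5 * r^2
    gsq = g @ g
    if gsq == 0.0:
        continue                          # sample already fit exactly
    eta = min((0.5 * r * r) / (c * gsq), eta_max)
    w = w - eta * g
print(np.linalg.norm(w - w_true))
```

No step-size tuning is needed: with `c = 0.5`, each step exactly zeroes the sampled residual on this least-squares problem, which is why SPS is attractive as a learning-rate rule.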
1 code implementation • 11 Oct 2019 • Sharan Vaswani, Abbas Mehrabian, Audrey Durand, Branislav Kveton
We propose RandUCB, a bandit strategy that builds on theoretically derived confidence intervals similar to upper confidence bound (UCB) algorithms, but akin to Thompson sampling (TS), it uses randomization to trade off exploration and exploitation.
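The core idea, randomizing the exploration bonus rather than fixing it, can be sketched on Bernoulli arms; the uniform distribution for the multiplier and the width formula below are illustrative choices, not the paper's exact design.

```python
import numpy as np

# Toy sketch of the RandUCB idea: act like UCB, but scale the
# confidence width by a random Z instead of a constant.

rng = np.random.default_rng(0)
means = np.array([0.2, 0.5, 0.8])         # true (unknown) arm means
K, T = len(means), 3000
counts = np.zeros(K)
sums = np.zeros(K)

for t in range(1, T + 1):
    if t <= K:
        arm = t - 1                       # pull each arm once to start
    else:
        width = np.sqrt(2.0 * np.log(t) / counts)
        z = rng.uniform(0.0, 1.0)         # randomized bonus multiplier
        arm = int(np.argmax(sums / counts + z * width))
    sums[arm] += rng.binomial(1, means[arm])
    counts[arm] += 1

best = int(np.argmax(counts))
print(best)                               # the best arm gets pulled most
```

With `z = 1` fixed this reduces to standard UCB; sampling `z` makes the index randomized, TS-style, while keeping the UCB confidence structure.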
1 code implementation • 11 Oct 2019 • Si Yi Meng, Sharan Vaswani, Issam Laradji, Mark Schmidt, Simon Lacoste-Julien
Under this condition, we show that the regularized subsampled Newton method (R-SSN) achieves global linear convergence with an adaptive step-size and a constant batch-size.
1 code implementation • NeurIPS 2019 • Sharan Vaswani, Aaron Mishkin, Issam Laradji, Mark Schmidt, Gauthier Gidel, Simon Lacoste-Julien
To improve the proposed methods' practical performance, we give heuristics to use larger step-sizes and acceleration.
no code implementations • 13 Nov 2018 • Branislav Kveton, Csaba Szepesvari, Sharan Vaswani, Zheng Wen, Mohammad Ghavamzadeh, Tor Lattimore
Specifically, it pulls the arm with the highest mean reward in a non-parametric bootstrap sample of its history with pseudo rewards.
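This pull rule can be sketched as follows; augmenting each observation with one pseudo reward of 0 and one of 1 is an illustrative choice in the spirit of the paper, and the arm means and horizon are toys.

```python
import numpy as np

# Toy sketch of bootstrapped exploration with pseudo rewards: each
# round, resample every arm's augmented history and pull the arm
# whose bootstrap sample has the highest mean.

rng = np.random.default_rng(0)
means = [0.3, 0.7]                        # true (unknown) arm means
history = [[], []]

for _ in range(2000):
    est = []
    for a in range(2):
        if not history[a]:
            est.append(np.inf)            # force an initial pull
            continue
        s = len(history[a])
        # real observations plus s pseudo zeros and s pseudo ones
        augmented = np.array(history[a] + [0.0] * s + [1.0] * s)
        boot = rng.choice(augmented, size=augmented.size, replace=True)
        est.append(boot.mean())
    arm = int(np.argmax(est))
    history[arm].append(float(rng.binomial(1, means[arm])))

print(len(history[0]), len(history[1]))   # the 0.7 arm dominates
```

The pseudo rewards keep the bootstrap sample from collapsing onto a short, lucky history, which is what drives the exploration.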
no code implementations • 16 Oct 2018 • Sharan Vaswani, Francis Bach, Mark Schmidt
Under this condition, we prove that constant step-size stochastic gradient descent (SGD) with Nesterov acceleration matches the convergence rate of the deterministic accelerated method for both convex and strongly-convex functions.
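The accelerated update can be sketched on a strongly-convex quadratic; the run below is the deterministic special case (the paper's stochastic result additionally needs the interpolation/growth condition), using step-size $1/L$ and the standard strongly-convex momentum.

```python
import numpy as np

# Toy sketch of Nesterov acceleration with a constant step-size on
# f(x) = 0.5 x^T A x.

A = np.diag([1.0, 10.0])                  # mu = 1, L = 10
mu, L = 1.0, 10.0
beta = (np.sqrt(L / mu) - 1.0) / (np.sqrt(L / mu) + 1.0)

x = v = np.array([1.0, 1.0])
for _ in range(200):
    g = A @ v                             # gradient at the lookahead point
    x_next = v - g / L                    # gradient step from the lookahead
    v = x_next + beta * (x_next - x)      # extrapolate past the new iterate
    x = x_next
print(np.linalg.norm(x))                  # ~0 at the accelerated linear rate
```

The distinguishing feature versus heavy ball is that the gradient is evaluated at the extrapolated point `v`, not at the current iterate.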
no code implementations • 10 Oct 2018 • Mohamed Osama Ahmed, Sharan Vaswani, Mark Schmidt
Indeed, in a particular setting, we prove that using the Lipschitz information yields the same or a better bound on the regret compared to using Bayesian optimization on its own.
no code implementations • 24 May 2018 • Sharan Vaswani, Branislav Kveton, Zheng Wen, Anup Rao, Mark Schmidt, Yasin Abbasi-Yadkori
We investigate the use of bootstrapping in the bandit setting.
no code implementations • 7 Mar 2017 • Sharan Vaswani, Mark Schmidt, Laks. V. S. Lakshmanan
The gang of bandits (GOB) model (Cesa-Bianchi et al., 2013) is a recent contextual bandit framework that shares information between a set of bandit problems related by a known (possibly noisy) graph.
no code implementations • ICML 2017 • Sharan Vaswani, Branislav Kveton, Zheng Wen, Mohammad Ghavamzadeh, Laks Lakshmanan, Mark Schmidt
We consider influence maximization (IM) in social networks, which is the problem of maximizing the number of users that become aware of a product by selecting a set of "seed" users to expose the product to.
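The standard baseline for this problem, greedy seed selection with Monte-Carlo spread estimates under the independent cascade model, can be sketched as follows; the graph, edge probability `p`, trial count, and budget `k` are all illustrative.

```python
import random

# Toy sketch of greedy seed selection for influence maximization
# under the independent cascade model.

def spread(graph, seeds, p=0.3, trials=200):
    """Monte-Carlo estimate of the expected number of activated nodes."""
    rng = random.Random(0)                # fixed seed: deterministic estimate
    total = 0
    for _ in range(trials):
        active, frontier = set(seeds), list(seeds)
        while frontier:                   # simulate one cascade
            nxt = []
            for u in frontier:
                for v in graph.get(u, []):
                    if v not in active and rng.random() < p:
                        active.add(v)
                        nxt.append(v)
            frontier = nxt
        total += len(active)
    return total / trials

def greedy_im(graph, k):
    nodes = set(graph) | {v for vs in graph.values() for v in vs}
    seeds = []
    for _ in range(k):                    # repeatedly add the best marginal seed
        best = max(nodes - set(seeds),
                   key=lambda u: spread(graph, seeds + [u]))
        seeds.append(best)
    return seeds

graph = {0: [1, 2, 3, 4], 1: [2], 2: [3], 5: [0]}  # node 0 is a hub
chosen = greedy_im(graph, 2)
print(chosen)                             # the hub is picked first
```

Because the spread function is monotone and submodular, this greedy procedure enjoys the classical $(1 - 1/e)$ approximation guarantee.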
1 code implementation • NeurIPS 2017 • Zheng Wen, Branislav Kveton, Michal Valko, Sharan Vaswani
Specifically, we aim to learn the set of "best influencers" in a social network online while repeatedly interacting with it.
no code implementations • 27 Apr 2016 • Sharan Vaswani, Laks. V. S. Lakshmanan
A disadvantage of this setting is that the marketer is forced to select all the seeds based solely on a diffusion model.
no code implementations • 27 Feb 2015 • Sharan Vaswani, Laks. V. S. Lakshmanan, Mark Schmidt
We consider the problem of \emph{influence maximization}: maximizing the number of people that become aware of a product by finding the `best' set of `seed' users to expose the product to.
no code implementations • 24 Oct 2013 • Rahul Thota, Sharan Vaswani, Amit Kale, Nagavijayalakshmi Vydyanathan
This allows us to initialize a sparse grid of seed points as tentative salient-region centers and iteratively converge to the local entropy maxima, reducing the computational complexity compared to the Kadir-Brady approach, which performs this computation at every point in the image.