no code implementations • 12 Jan 2024 • Anh Dang, Reza Babanezhad, Sharan Vaswani
In particular, for strongly-convex quadratics with condition number $\kappa$, we prove that stochastic heavy ball momentum (SHB) with the standard step-size and momentum parameters results in an $O\left(\exp(-\frac{T}{\sqrt{\kappa}}) + \sigma \right)$ convergence rate, where $T$ is the number of iterations and $\sigma^2$ is the variance in the stochastic gradients.
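As a concrete illustration, here is a minimal sketch of the SHB update on a noisy strongly-convex quadratic; the step-size, momentum, and noise model below are illustrative placeholders, not the specific "standard" parameter choices analyzed in the paper.

```python
import numpy as np

def shb_quadratic(A, b, T=1000, alpha=0.01, beta=0.9, noise_std=0.1, seed=0):
    """Stochastic heavy-ball (SHB) on f(x) = 0.5 x^T A x - b^T x.

    Gradients are perturbed with additive Gaussian noise to mimic stochastic
    gradients whose variance plays the role of sigma^2 in the rate above.
    """
    rng = np.random.default_rng(seed)
    d = A.shape[0]
    x_prev = x = np.zeros(d)
    for _ in range(T):
        grad = A @ x - b + noise_std * rng.standard_normal(d)
        # Gradient step plus momentum on the difference of successive iterates.
        x_next = x - alpha * grad + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x

# Example: a 2-d quadratic with condition number kappa = 100.
A = np.diag([1.0, 100.0])
b = np.array([1.0, 1.0])
x_hat = shb_quadratic(A, b)
```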
1 code implementation • 25 May 2023 • Baojian Zhou, Yifan Sun, Reza Babanezhad
This paper studies the online node classification problem under a transductive learning setting.
1 code implementation • NeurIPS 2023 • Sharan Vaswani, Amirreza Kazemi, Reza Babanezhad, Nicolas Le Roux
Instantiating the generic algorithm results in an actor that maximizes a sequence of surrogate functions (similar to TRPO and PPO) and a critic that minimizes a closely related objective.
1 code implementation • 6 Feb 2023 • Jonathan Wilder Lavington, Sharan Vaswani, Reza Babanezhad, Mark Schmidt, Nicolas Le Roux
Our target optimization framework uses the (expensive) gradient computation to construct surrogate functions in a \emph{target space} (e.g., the logits output by a linear model for classification) that can be minimized efficiently.
1 code implementation • 11 Apr 2022 • Arushi Jain, Sharan Vaswani, Reza Babanezhad, Csaba Szepesvari, Doina Precup
We propose a generic primal-dual framework that allows us to bound the reward sub-optimality and constraint violation for arbitrary algorithms in terms of their primal and dual regret on online linear optimization problems.
no code implementations • 21 Oct 2021 • Sharan Vaswani, Benjamin Dubois-Taine, Reza Babanezhad
In order to be adaptive to the smoothness, we use a stochastic line-search (SLS) and show (via upper and lower bounds) that SGD with SLS converges at the desired rate, but only to a neighbourhood of the solution.
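For intuition, the following is a minimal sketch of SGD with a backtracking stochastic line-search, where the Armijo-style condition is checked on the same sampled mini-batch used for the gradient; the function names, constants, and termination guard are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def sgd_with_sls(loss_fn, grad_fn, x0, batches, T=100,
                 eta_max=1.0, c=0.5, backtrack=0.7, seed=0):
    """SGD with a stochastic line-search (SLS): shrink the step-size until
    the Armijo condition holds on the current mini-batch."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(T):
        batch = batches[rng.integers(len(batches))]
        g = grad_fn(x, batch)
        f_x = loss_fn(x, batch)
        eta = eta_max
        # Shrink the step-size until sufficient decrease is observed on this batch.
        while eta > 1e-10 and loss_fn(x - eta * g, batch) > f_x - c * eta * np.dot(g, g):
            eta *= backtrack
        x = x - eta * g
    return x
```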
no code implementations • 18 Feb 2021 • Benjamin Dubois-Taine, Sharan Vaswani, Reza Babanezhad, Mark Schmidt, Simon Lacoste-Julien
Variance reduction (VR) methods for finite-sum minimization typically require the knowledge of problem-dependent constants that are often unknown and difficult to estimate.
no code implementations • 23 Nov 2020 • Reza Babanezhad, Simon Lacoste-Julien
Mirror-prox (MP) is a well-known algorithm to solve variational inequality (VI) problems.
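For context, mirror-prox with the Euclidean mirror map reduces to the extragradient method; a minimal unconstrained, fixed step-size sketch (not the variant studied in the paper) looks like this.

```python
import numpy as np

def extragradient(F, x0, eta=0.1, T=1000):
    """Mirror-prox with the Euclidean mirror map, i.e., the extragradient method.

    F is the monotone operator of the variational inequality; for a min-max
    problem, F stacks the gradient in x and the negative gradient in y.
    """
    z = np.array(x0, dtype=float)
    for _ in range(T):
        z_half = z - eta * F(z)       # extrapolation (prediction) step
        z = z - eta * F(z_half)       # update step using the operator at the midpoint
    return z

# Example: the bilinear saddle point min_x max_y x*y, with F(x, y) = (y, -x).
F = lambda z: np.array([z[1], -z[0]])
z_star = extragradient(F, x0=[1.0, 1.0])
```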
no code implementations • 11 Jun 2020 • Sharan Vaswani, Reza Babanezhad, Jose Gallego, Aaron Mishkin, Simon Lacoste-Julien, Nicolas Le Roux
For under-parameterized linear classification, we prove that for any linear classifier separating the data, there exists a family of quadratic norms $\|\cdot\|_P$ such that the classifier's direction is the same as that of the maximum $P$-margin solution.
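For reference, a standard way to write the maximum $P$-margin direction for a quadratic norm on separable data $(x_i, y_i)$ is given below; this is generic notation, not necessarily the paper's.

```latex
% Quadratic norm induced by a positive-definite matrix P:
%   \|w\|_P = \sqrt{w^\top P w}
% Maximum P-margin direction:
w_P^\star \in \arg\max_{\|w\|_P \le 1} \; \min_i \; y_i \, \langle w, x_i \rangle
```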
no code implementations • 25 Sep 2019 • Ousmane Amadou Dia, Elnaz Barshan, Reza Babanezhad
While progress has been made in crafting visually imperceptible adversarial examples, constructing semantically meaningful ones remains a challenge.
1 code implementation • NeurIPS 2019 • Sébastien M. R. Arnold, Pierre-Antoine Manzagol, Reza Babanezhad, Ioannis Mitliagkas, Nicolas Le Roux
While variance reduction methods have shown that reusing past gradients can be beneficial when there is a finite number of datapoints, they do not easily extend to the online setting.
no code implementations • 10 Mar 2019 • Ousmane Amadou Dia, Elnaz Barshan, Reza Babanezhad
While progress has been made in crafting visually imperceptible adversarial examples, constructing semantically meaningful ones remains a challenge.
1 code implementation • 6 Jul 2018 • Issam Laradji, Reza Babanezhad
Unsupervised domain adaptation techniques have been successful for a wide range of problems where supervised labels are limited.
no code implementations • 5 Nov 2015 • Reza Babanezhad, Mohamed Osama Ahmed, Alim Virani, Mark Schmidt, Jakub Konečný, Scott Sallinen
We present and analyze several strategies for improving the performance of stochastic variance-reduced gradient (SVRG) methods.
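As background, the basic SVRG inner/outer loop that such strategies modify can be sketched as follows, with a generic step-size and inner-loop length; the paper's specific practical strategies are not reproduced here.

```python
import numpy as np

def svrg(grad_i, n, x0, T_outer=20, m=None, eta=0.1, seed=0):
    """Basic SVRG loop.

    grad_i(x, i) returns the gradient of the i-th summand at x; the full
    gradient is recomputed at each snapshot to build variance-reduced
    stochastic gradients.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    m = m or 2 * n
    for _ in range(T_outer):
        snapshot = x.copy()
        full_grad = np.mean([grad_i(snapshot, i) for i in range(n)], axis=0)
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced gradient estimate.
            v = grad_i(x, i) - grad_i(snapshot, i) + full_grad
            x = x - eta * v
    return x
```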
no code implementations • 31 Oct 2015 • Mohammad Emtiyaz Khan, Reza Babanezhad, Wu Lin, Mark Schmidt, Masashi Sugiyama
We also give a convergence-rate analysis of our method and of many previous methods that exploit the geometry of the space.
no code implementations • 16 Apr 2015 • Mark Schmidt, Reza Babanezhad, Mohamed Osama Ahmed, Aaron Defazio, Ann Clifton, Anoop Sarkar
We apply stochastic average gradient (SAG) algorithms for training conditional random fields (CRFs).
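For background, the basic SAG update for a generic finite sum (not the CRF-specific training pipeline from the paper) can be sketched as follows.

```python
import numpy as np

def sag(grad_i, n, x0, T=10000, eta=0.01, seed=0):
    """Stochastic average gradient (SAG), a minimal sketch.

    A table of the most recent gradient for each of the n training examples
    is kept; each step refreshes one entry and moves along the table's average.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    grad_table = np.zeros((n, x.size))
    grad_sum = np.zeros_like(x)
    for _ in range(T):
        i = rng.integers(n)
        g_new = grad_i(x, i)
        # Replace the stored gradient for example i and update the running sum.
        grad_sum += g_new - grad_table[i]
        grad_table[i] = g_new
        x = x - (eta / n) * grad_sum
    return x
```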