1 code implementation • 21 Mar 2024 • Mathieu Blondel, Vincent Roulet
Artificial intelligence has recently experienced remarkable advances, fueled by large models, vast datasets, accelerated hardware, and, last but not least, the transformative power of differentiable programming.
no code implementations • 30 Nov 2023 • Vincent Roulet, Atish Agarwala, Fabian Pedregosa
Recent empirical work has revealed an intriguing property of deep learning models: the sharpness (the largest eigenvalue of the Hessian) increases throughout optimization until it stabilizes around a critical value, at which point the optimizer operates at the edge of stability for the given fixed stepsize (Cohen et al., 2022).
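As a concrete aid, the sharpness can be estimated by power iteration on Hessian-vector products. The sketch below is illustrative only, assuming a scalar-valued loss_fn and a pytree of params supplied by the reader; it is not code from the paper.

    import jax
    import jax.numpy as jnp
    from jax.flatten_util import ravel_pytree

    def sharpness(loss_fn, params, key, num_iters=50):
        # Estimate the largest Hessian eigenvalue of loss_fn at params
        # by power iteration on Hessian-vector products.
        flat, unravel = ravel_pytree(params)
        grad_fn = jax.grad(lambda p: loss_fn(unravel(p)))

        def hvp(v):
            # Forward-over-reverse Hessian-vector product.
            return jax.jvp(grad_fn, (flat,), (v,))[1]

        v = jax.random.normal(key, flat.shape)
        v = v / jnp.linalg.norm(v)
        eig = 0.0
        for _ in range(num_iters):
            hv = hvp(v)
            eig = jnp.vdot(v, hv)  # Rayleigh quotient estimate
            v = hv / jnp.linalg.norm(hv)
        return eig

At the edge of stability, this estimate hovers around $2/h$ for gradient descent with stepsize $h$.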
no code implementations • 21 Oct 2023 • Ronak Mehta, Vincent Roulet, Krishna Pillutla, Zaid Harchaoui
We consider the distributionally robust optimization (DRO) problem with spectral risk-based uncertainty set and $f$-divergence penalty.
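Schematically, and in our own notation rather than the paper's, the problem reads $\min_w \max_{q \in \mathcal{Q}(\sigma)} \sum_{i=1}^n q_i \ell_i(w) - \nu D_f(q \,\|\, \mathbf{1}_n/n)$, where the $\ell_i(w)$ are per-sample losses, $\mathcal{Q}(\sigma)$ is the uncertainty set induced by the spectrum of weights $\sigma$, and $\nu \geq 0$ controls the strength of the $f$-divergence penalty toward the uniform distribution.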
no code implementations • 17 Aug 2023 • Vincent Roulet, Mathieu Blondel
Inspired by Gauss-Newton-like methods, we study the benefit of leveraging the structure of deep learning objectives, namely the composition of a convex loss function with a nonlinear network, to derive direction oracles better than stochastic gradients, based on the idea of partial linearization.
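Concretely, for objectives of the form $h(c(w))$ with $h$ convex and $c$ a nonlinear network, partial linearization keeps $h$ intact and linearizes only the inner map: schematically, the direction solves $\min_d \, h(c(w) + \nabla c(w) d) + \tfrac{1}{2\gamma}\|d\|^2$ (a prox-linear-style oracle in our notation, not a verbatim formula from the paper).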
no code implementations • 18 May 2023 • Krishna Pillutla, Vincent Roulet, Sham Kakade, Zaid Harchaoui
Gauss-Newton methods and their stochastic versions have been widely used in machine learning and signal processing.
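As a reminder, for a nonlinear least-squares objective $f(w) = \tfrac{1}{2}\|r(w)\|^2$ with residual Jacobian $J$, the Gauss-Newton step $d$ solves $J^\top J d = -J^\top r$, using the positive semidefinite matrix $J^\top J$ in place of the full Hessian.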
1 code implementation • 10 Dec 2022 • Ronak Mehta, Vincent Roulet, Krishna Pillutla, Lang Liu, Zaid Harchaoui
Spectral risk objectives, also called $L$-risks, allow learning systems to interpolate between optimizing average-case performance (as in empirical risk minimization) and worst-case performance on a task.
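For intuition, a spectral risk is simply a weighted average of sorted losses. The snippet below is an illustrative JAX sketch; the names spectral_risk, losses, and sigma are ours, not the paper's code.

    import jax.numpy as jnp

    def spectral_risk(losses, sigma):
        # sigma: nonnegative, nondecreasing weights summing to one.
        # Sorting pairs the largest weights with the largest losses.
        return jnp.dot(jnp.sort(losses), sigma)

    n = 8
    uniform = jnp.full(n, 1.0 / n)        # recovers the average loss (ERM)
    worst = jnp.zeros(n).at[-1].set(1.0)  # recovers the worst-case loss
    k = 2
    cvar = jnp.concatenate([jnp.zeros(n - k), jnp.full(k, 1.0 / k)])  # superquantile

Choosing sigma thus tunes the objective continuously between the average-case and worst-case extremes.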
1 code implementation • 13 Jul 2022 • Vincent Roulet, Siddhartha Srinivasa, Maryam Fazel, Zaid Harchaoui
We present implementations of nonlinear control algorithms based on linear and quadratic approximations of the objective, taken from a functional viewpoint.
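For reference, in the linear-quadratic case with dynamics $x_{t+1} = A_t x_t + B_t u_t$ and stage costs $\tfrac{1}{2} x_t^\top Q_t x_t + \tfrac{1}{2} u_t^\top R_t u_t$, the optimal feedback gains follow the backward Riccati recursion $K_t = -(R_t + B_t^\top P_{t+1} B_t)^{-1} B_t^\top P_{t+1} A_t$ and $P_t = Q_t + A_t^\top P_{t+1}(A_t + B_t K_t)$; algorithms such as ILQR iterate this recursion on successive linear-quadratic approximations of the nonlinear problem.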
1 code implementation • 2 Dec 2021 • Vincent Roulet, Zaid Harchaoui
Target Propagation (TP) algorithms compute targets instead of gradients along neural networks and propagate them backward in a way that is similar to, yet distinct from, gradient back-propagation (BP).
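For concreteness, in the difference target propagation variant (Lee et al., 2015), the target for layer $l-1$ is $\hat{h}_{l-1} = h_{l-1} + g_l(\hat{h}_l) - g_l(h_l)$, where $h_l$ are the forward activations and $g_l$ is a learned approximate inverse of layer $l$; each layer is then trained to move its output toward its target.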
no code implementations • 31 Dec 2020 • Vincent Roulet, Zaid Harchaoui
The notion of a Moreau envelope is central to the analysis of first-order optimization algorithms for machine learning.
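Recall that the Moreau envelope of $f$ with parameter $\lambda > 0$ is $f_\lambda(x) = \min_y \, f(y) + \tfrac{1}{2\lambda}\|x - y\|^2$: a smooth approximation of $f$ that shares its minimizers and, for convex $f$, has gradient $\nabla f_\lambda(x) = (x - \mathrm{prox}_{\lambda f}(x))/\lambda$.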
no code implementations • 20 Feb 2020 • Vincent Roulet, Zaid Harchaoui
We present an approach to obtain convergence guarantees of optimization algorithms for deep networks based on elementary arguments and computations.
1 code implementation • 30 Dec 2019 • Corinne Jones, Vincent Roulet, Zaid Harchaoui
We present a discriminative clustering approach in which the feature representation can be learned from data and which can moreover leverage labeled data.
1 code implementation • 19 Mar 2019 • Corinne Jones, Vincent Roulet, Zaid Harchaoui
Convolutional Neural Networks, like most artificial neural networks, are commonly viewed as methods fundamentally different from kernel-based methods.
1 code implementation • NeurIPS 2018 • Krishna Pillutla, Vincent Roulet, Sham M. Kakade, Zaid Harchaoui
We present a framework to train a structured prediction model by performing smoothing on the inference algorithm it builds upon.
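Schematically, the smoothing replaces the max over structures, $\max_y s(y)$, by a regularized version $\max_{q \in \Delta} \langle q, s \rangle - \varepsilon \Omega(q)$ for a strongly convex $\Omega$; with the negative entropy this becomes the log-sum-exp $\varepsilon \log \sum_y e^{s(y)/\varepsilon}$, which is differentiable and computable by the same dynamic program as the original inference (our schematic rendering of the idea).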
no code implementations • NeurIPS 2017 • Vincent Roulet, Alexandre d'Aspremont
The Łojasiewicz inequality shows that Hölderian error bounds on the minimum of convex optimization problems hold almost generically.
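In its simplest form, such an error bound states that on a compact set there exist $c > 0$ and $r \geq 1$ such that $f(x) - f^* \geq c \, d(x, X^*)^r$, where $X^*$ is the set of minimizers; the exponent $r$ then drives the rates achievable by restart schemes.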
no code implementations • NeurIPS 2017 • Damien Scieur, Vincent Roulet, Francis Bach, Alexandre d'Aspremont
We show that accelerated optimization methods can be seen as particular instances of multi-step integration schemes from numerical analysis, applied to the gradient flow equation.
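Here the gradient flow equation is $\dot{x}(t) = -\nabla f(x(t))$: gradient descent $x_{k+1} = x_k - h \nabla f(x_k)$ is its explicit Euler discretization, and, schematically, a multi-step scheme combines several past iterates and gradients, $x_{k+1} = \sum_i \rho_i x_{k-i} - h \sum_i \sigma_i \nabla f(x_{k-i})$, which is exactly the pattern of accelerated methods such as Nesterov's.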
no code implementations • 16 Jun 2015 • Vincent Roulet, Fajwel Fogel, Alexandre d'Aspremont, Francis Bach
We study supervised learning problems in which clustering constraints impose structure on either the features or the samples, with the aim of aiding both prediction and interpretation.