no code implementations • ICML 2020 • Damien Scieur, Fabian Pedregosa
We consider the average-case runtime analysis of algorithms for minimizing quadratic objectives.
no code implementations • ICML 2020 • Fabian Pedregosa, Damien Scieur
We develop a framework for designing optimal optimization methods in terms of their average-case runtime.
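The setup behind this line of work admits a compact statement; the following is a sketch in my own notation of the standard residual-polynomial formulation, not text from the papers. For a quadratic $f(x) = \tfrac{1}{2}(x - x^\star)^\top H (x - x^\star)$ with $H$ random, any first-order method run for $t$ steps satisfies $x_t - x^\star = P_t(H)(x_0 - x^\star)$ for some polynomial $P_t$ with $P_t(0) = 1$, so that (under standard isotropy assumptions on $x_0 - x^\star$)

$$\mathbb{E}\,\|x_t - x^\star\|^2 = \int P_t(\lambda)^2 \, d\mu(\lambda),$$

where $\mu$ is the expected spectral distribution of $H$. Worst-case analysis instead controls $\max_{\lambda \in [\ell, L]} |P_t(\lambda)|$; the average-case optimal method is the one whose residual polynomial minimizes the integral above.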
1 code implementation • 21 Feb 2024 • Sanjeev Raja, Ishan Amin, Fabian Pedregosa, Aditi S. Krishnapriyan
As a general framework applicable across NNIP architectures and systems, StABlE Training is a powerful tool for training stable and accurate NNIPs, particularly in the absence of large reference datasets.
no code implementations • 30 Nov 2023 • Vincent Roulet, Atish Agarwala, Fabian Pedregosa
Recent empirical work has revealed an intriguing property of deep learning models by which the sharpness (largest eigenvalue of the Hessian) increases throughout optimization until it stabilizes around a critical value at which the optimizer operates at the edge of stability, given a fixed stepsize (Cohen et al., 2022).
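On a quadratic, that critical value is exactly $2/\text{stepsize}$, which a few lines of NumPy make concrete (a toy illustration of the threshold under my own setup, not an experiment from the paper):

```python
import numpy as np

def gd_on_quadratic(sharpness, stepsize, steps=500):
    """Gradient descent on f(x) = 0.5 * sharpness * x^2, starting from x = 1."""
    x = 1.0
    for _ in range(steps):
        x -= stepsize * sharpness * x  # gradient of f is sharpness * x
    return abs(x)

eta = 0.1
for lam in [15.0, 19.9, 20.1]:  # 2 / eta = 20 is the stability threshold
    print(f"sharpness={lam:5.1f}  |x_final|={gd_on_quadratic(lam, eta):.2e}")
# Below 2/eta the iterates contract; above it they blow up, which is why the
# sharpness stabilizes near 2/eta when the optimizer sits at the edge of stability.
```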
no code implementations • 28 Dec 2022 • Paul Vicol, Jonathan Lorraine, Fabian Pedregosa, David Duvenaud, Roger Grosse
Bilevel problems consist of two nested sub-problems, called the outer and inner problems, respectively.
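The nesting has a standard form; a sketch in my notation, with hyperparameter optimization as the running example (outer: validation loss, inner: regularized training loss):

$$\min_{\lambda} \; F\big(\lambda, w^\star(\lambda)\big) \quad \text{(outer)} \qquad \text{s.t.} \qquad w^\star(\lambda) \in \operatorname*{arg\,min}_{w} \; G(\lambda, w) \quad \text{(inner)}.$$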
no code implementations • 8 Dec 2022 • Charline Le Lan, Joshua Greaves, Jesse Farebrother, Mark Rowland, Fabian Pedregosa, Rishabh Agarwal, Marc G. Bellemare
In this paper, we derive an algorithm that learns a principal subspace from sample entries, can be applied when the approximate subspace is represented by a neural network, and hence can be scaled to datasets with an effectively infinite number of rows and columns.
no code implementations • 9 Nov 2022 • Junhyung Lyle Kim, Gauthier Gidel, Anastasios Kyrillidis, Fabian Pedregosa
The extragradient method has gained popularity due to its robust convergence properties for differentiable games.
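For intuition, on the bilinear toy game $\min_x \max_y xy$ the extragradient extrapolation step is what buys convergence where plain gradient descent-ascent spirals outward; a minimal NumPy sketch (my own toy example, not taken from the paper):

```python
import numpy as np

def grad_field(x, y):
    # Vector field of min_x max_y f(x, y) = x * y: descend in x, ascend in y.
    return np.array([y, -x])

def extragradient(steps=100, eta=0.2):
    z = np.array([1.0, 1.0])  # (x, y)
    for _ in range(steps):
        z_half = z - eta * grad_field(*z)   # extrapolation (lookahead) step
        z = z - eta * grad_field(*z_half)   # update using the lookahead gradient
    return np.linalg.norm(z)

def gda(steps=100, eta=0.2):
    z = np.array([1.0, 1.0])
    for _ in range(steps):
        z = z - eta * grad_field(*z)        # plain simultaneous descent-ascent
    return np.linalg.norm(z)

print("extragradient distance to equilibrium:", extragradient())  # contracts
print("descent-ascent distance to equilibrium:", gda())           # expands
```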
no code implementations • 10 Oct 2022 • Atish Agarwala, Fabian Pedregosa, Jeffrey Pennington
Recent studies of gradient descent with large step sizes have shown that there is often a regime with an initial increase in the largest eigenvalue of the loss Hessian (progressive sharpening), followed by a stabilization of the eigenvalue near the maximum value which allows convergence (edge of stability).
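The quantity being tracked, the largest Hessian eigenvalue, is usually estimated without forming the Hessian, via power iteration on Hessian-vector products; a generic sketch using finite-difference HVPs (illustrative, not the paper's measurement code):

```python
import numpy as np

def sharpness(grad_fn, params, iters=50, eps=1e-4, seed=0):
    """Estimate the largest Hessian eigenvalue by power iteration.

    Hessian-vector products are approximated by a finite difference of gradients:
    H v ~= (grad(w + eps * v) - grad(w)) / eps.
    """
    rng = np.random.default_rng(seed)
    v = rng.normal(size=params.shape)
    v /= np.linalg.norm(v)
    g0 = grad_fn(params)
    lam = 0.0
    for _ in range(iters):
        hv = (grad_fn(params + eps * v) - g0) / eps
        lam = float(v @ hv)                      # Rayleigh quotient estimate
        v = hv / (np.linalg.norm(hv) + 1e-12)
    return lam

# Sanity check on a quadratic with known eigenvalues 1..5.
H = np.diag([1.0, 2.0, 3.0, 4.0, 5.0])
print(sharpness(lambda w: H @ w, np.ones(5)))    # approximately 5.0
```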
no code implementations • 27 Sep 2022 • Damien Scieur, Quentin Bertrand, Gauthier Gidel, Fabian Pedregosa
Computing the Jacobian of the solution of an optimization problem is a central problem in machine learning, with applications in hyperparameter optimization, meta-learning, optimization as a layer, and dataset distillation, to name a few.
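That Jacobian has a closed form via the implicit function theorem; a sketch of the standard identity in my notation. If $x^\star(\theta)$ is characterized by the optimality condition $\nabla_x f(x^\star(\theta), \theta) = 0$, differentiating through that condition gives

$$\partial_\theta x^\star(\theta) = -\big[\nabla^2_{xx} f(x^\star, \theta)\big]^{-1} \, \nabla^2_{x\theta} f(x^\star, \theta),$$

so computing the Jacobian reduces to one linear solve against the Hessian of the inner problem rather than differentiating through the inner solver's iterations.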
no code implementations • 20 Jun 2022 • Leonardo Cunha, Gauthier Gidel, Fabian Pedregosa, Damien Scieur, Courtney Paquette
The recently developed average-case analysis of optimization methods allows a more fine-grained and representative convergence analysis than usual worst-case results.
no code implementations • 24 Feb 2022 • Robert M. Gower, Mathieu Blondel, Nidham Gazagnadou, Fabian Pedregosa
We use this insight to develop new variants of the SPS method that are better suited to nonlinear models.
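For context, the basic stochastic Polyak step-size (SPS) that these variants build on divides the per-example suboptimality by the squared gradient norm; a minimal sketch with the usual cap, assuming the per-example optimal values $f_i^*$ are known (often taken to be zero for interpolating models):

```python
import numpy as np

def sps_step(x, grad_i, loss_i, loss_i_star=0.0, c=0.5, gamma_max=1.0):
    """One SGD step with the stochastic Polyak step-size (basic SPS form).

    gamma = (f_i(x) - f_i^*) / (c * ||grad f_i(x)||^2), capped at gamma_max.
    """
    g = grad_i(x)
    gamma = (loss_i(x) - loss_i_star) / (c * np.dot(g, g) + 1e-12)
    return x - min(gamma, gamma_max) * g
```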
1 code implementation • ICLR 2022 • Utku Evci, Bart van Merriënboer, Thomas Unterthiner, Max Vladymyrov, Fabian Pedregosa
The architecture and the parameters of neural networks are often optimized independently, which requires costly retraining of the parameters whenever the architecture is modified.
1 code implementation • NeurIPS 2021 • Mathieu Blondel, Quentin Berthet, Marco Cuturi, Roy Frostig, Stephan Hoyer, Felipe Llinares-López, Fabian Pedregosa, Jean-Philippe Vert
In this paper, we propose automatic implicit differentiation, an efficient and modular approach for implicit differentiation of optimization problems.
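A small numerical check of the idea on ridge regression, where differentiating the optimality condition gives the derivative of the solution with respect to the regularization strength (my own toy example; it does not use or depict the library's API):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = rng.normal(size=50)
lam = 0.3

def ridge_solution(lam):
    # w*(lam) solves (X^T X + lam I) w = X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)

# Implicit differentiation: the optimality condition (X^T X + lam I) w - X^T y = 0
# gives dw/dlam = -(X^T X + lam I)^{-1} w*.
w_star = ridge_solution(lam)
dw_implicit = -np.linalg.solve(X.T @ X + lam * np.eye(5), w_star)

# Finite-difference check.
eps = 1e-6
dw_fd = (ridge_solution(lam + eps) - ridge_solution(lam - eps)) / (2 * eps)
print(np.allclose(dw_implicit, dw_fd, atol=1e-6))  # True
```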
no code implementations • 19 May 2021 • Gideon Dresdner, Saurav Shekhar, Fabian Pedregosa, Francesco Locatello, Gunnar Rätsch
Variational Inference makes a trade-off between the capacity of the variational family and the tractability of finding an approximate posterior distribution.
1 code implementation • 17 Feb 2021 • Fartash Faghri, Sven Gowal, Cristina Vasconcelos, David J. Fleet, Fabian Pedregosa, Nicolas Le Roux
We demonstrate that the choice of optimizer, neural network architecture, and regularizer significantly affect the adversarial robustness of linear neural networks, providing guarantees without the need for adversarial training.
no code implementations • 8 Feb 2021 • Courtney Paquette, Kiwon Lee, Fabian Pedregosa, Elliot Paquette
We propose a new framework, inspired by random matrix theory, for analyzing the dynamics of stochastic gradient descent (SGD) when both number of samples and dimensions are large.
no code implementations • ICLR 2021 • Carles Domingo-Enrich, Fabian Pedregosa, Damien Scieur
First, we show that for zero-sum bilinear games the average-case optimal method is the optimal method for the minimization of the Hamiltonian.
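The Hamiltonian of a zero-sum bilinear game is the squared norm of the game's vector field; a sketch in standard notation:

$$\min_x \max_y \; x^\top A y, \qquad v(x, y) = \begin{pmatrix} A y \\ -A^\top x \end{pmatrix}, \qquad \mathcal{H}(x, y) = \tfrac{1}{2}\|v(x, y)\|^2 = \tfrac{1}{2}\big(\|A y\|^2 + \|A^\top x\|^2\big),$$

whose minimizers (where $\mathcal{H} = 0$) are exactly the equilibria of the game, so minimizing $\mathcal{H}$ turns the game into an ordinary minimization problem.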
no code implementations • 8 Jun 2020 • Courtney Paquette, Bart van Merriënboer, Elliot Paquette, Fabian Pedregosa
In fact, the halting time exhibits a universality property: it is independent of the probability distribution.
1 code implementation • ICML 2020 • Geoffrey Négiar, Gideon Dresdner, Alicia Tsai, Laurent El Ghaoui, Francesco Locatello, Robert M. Freund, Fabian Pedregosa
We propose a novel Stochastic Frank-Wolfe (a.k.a. conditional gradient) algorithm.
no code implementations • ICLR 2020 • Lukas Balles, Fabian Pedregosa, Nicolas Le Roux
Sign-based optimization methods have become popular in machine learning due to their favorable communication cost in distributed optimization and their surprisingly good performance in neural network training.
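The basic member of this family keeps only the coordinate-wise sign of the gradient, which is what makes the communication cost one bit per coordinate; a minimal sketch (illustrative, not the specific variant analyzed in the paper):

```python
import numpy as np

def sign_sgd_step(params, grad, lr=1e-3):
    """Sign-based update: only the sign of each gradient coordinate is used,
    so a distributed worker needs to communicate just one bit per coordinate."""
    return params - lr * np.sign(grad)
```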
no code implementations • 12 Feb 2020 • Fabian Pedregosa, Damien Scieur
We develop a framework for the average-case analysis of random quadratic problems and derive algorithms that are optimal under this analysis.
1 code implementation • 8 Oct 2019 • Elena Kalinina, Fabian Pedregosa, Vittorio Iacovella, Emanuele Olivetti, Paolo Avesani
In the last decade, the identification of shared activity patterns has been mostly framed as a supervised learning problem.
no code implementations • ICML Workshop Deep Phenomena 2019 • Utku Evci, Fabian Pedregosa, Aidan Gomez, Erich Elsen
Additionally, our attempts to find a decreasing objective path from "bad" solutions to the "good" ones in the sparse subspace fail.
no code implementations • 18 Jun 2019 • Valentin Thomas, Fabian Pedregosa, Bart van Merriënboer, Pierre-Antoine Manzagol, Yoshua Bengio, Nicolas Le Roux
The speed at which one can minimize an expected loss using stochastic methods depends on two properties: the curvature of the loss and the variance of the gradients.
1 code implementation • 19 Jun 2018 • Fabian Pedregosa, Kilian Fatras, Mattia Casotto
This is due to the fact that existing methods require to evaluate the proximity operator for the nonsmooth terms, which can be a costly operation for complex penalties.
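The proximity operator mentioned here is the basic primitive of proximal methods; for a simple penalty like $\ell_1$ it has a closed form (soft-thresholding), while for complex penalties it may itself require an inner iterative solver. A generic sketch of the $\ell_1$ case and one proximal-gradient step (a standard illustration, not this paper's algorithm):

```python
import numpy as np

def prox_l1(v, threshold):
    """Proximity operator of threshold * ||.||_1: coordinate-wise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - threshold, 0.0)

def proximal_gradient_step(x, grad_smooth, step, reg):
    """One step of proximal gradient descent on smooth_loss(x) + reg * ||x||_1."""
    return prox_l1(x - step * grad_smooth(x), step * reg)
```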
no code implementations • 9 Apr 2018 • Gauthier Gidel, Fabian Pedregosa, Simon Lacoste-Julien
In this work, we develop and analyze the Frank-Wolfe Augmented Lagrangian (FW-AL) algorithm, a method for minimizing a smooth function over convex compact sets related by a "linear consistency" constraint that only requires access to a linear minimization oracle over the individual constraints.
no code implementations • ICML 2018 • Fabian Pedregosa, Gauthier Gidel
We propose and analyze an adaptive step-size variant of the Davis-Yin three operator splitting.
no code implementations • ICML 2018 • Thomas Kerdreux, Fabian Pedregosa, Alexandre d'Aspremont
The first algorithm that we propose is a randomized variant of the original FW algorithm and achieves a $\mathcal{O}(1/t)$ sublinear convergence rate as in the deterministic counterpart.
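For reference, the deterministic counterpart is the classical Frank-Wolfe iteration, which touches the constraint set only through a linear minimization oracle; a sketch over the probability simplex (generic FW, not the randomized variant proposed in the paper):

```python
import numpy as np

def frank_wolfe_simplex(grad_fn, dim, iters=200):
    """Classical Frank-Wolfe over the probability simplex.

    The linear minimization oracle over the simplex is simply the vertex
    (coordinate) with the most negative gradient entry.
    """
    x = np.full(dim, 1.0 / dim)
    for t in range(iters):
        g = grad_fn(x)
        s = np.zeros(dim)
        s[np.argmin(g)] = 1.0          # LMO: best vertex of the simplex
        gamma = 2.0 / (t + 2.0)        # standard open-loop step size
        x = (1 - gamma) * x + gamma * s
    return x

# Example: project a point onto the simplex by minimizing ||x - y||^2 / 2.
y = np.array([0.1, 0.5, 0.2])
print(frank_wolfe_simplex(lambda x: x - y, dim=3))
```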
no code implementations • 11 Jan 2018 • Rémi Leblond, Fabian Pedregosa, Simon Lacoste-Julien
Notably, we prove that ASAGA and KROMAGNON can obtain a theoretical linear speedup on multi-core systems even without sparsity assumptions.
1 code implementation • NeurIPS 2017 • Fabian Pedregosa, Rémi Leblond, Simon Lacoste-Julien
Due to their simplicity and excellent performance, parallel asynchronous variants of stochastic gradient descent have become popular methods to solve a wide range of large-scale optimization problems on multi-core architectures.
no code implementations • 25 Oct 2016 • Fabian Pedregosa
The three operator splitting scheme was recently proposed by [Davis and Yin, 2015] as a method to optimize composite objective functions with one convex smooth term and two convex (possibly non-smooth) terms for which we have access to their proximity operator.
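The scheme itself is a compact iteration; a sketch, assuming the two proximity operators and the gradient of the smooth term are available as plain functions (a fixed step size is used here for simplicity):

```python
import numpy as np

def davis_yin(prox_f, prox_g, grad_h, z0, step, iters=500):
    """Three operator splitting for min_x f(x) + g(x) + h(x), with h smooth
    and f, g accessed only through their proximity operators (Davis and Yin, 2015)."""
    z = np.array(z0, dtype=float)
    for _ in range(iters):
        x_g = prox_g(z, step)
        x_f = prox_f(2 * x_g - z - step * grad_h(x_g), step)
        z = z + x_f - x_g
    return x_g  # x_g converges to a solution
```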
1 code implementation • 15 Jun 2016 • Rémi Leblond, Fabian Pedregosa, Simon Lacoste-Julien
We describe ASAGA, an asynchronous parallel version of the incremental gradient algorithm SAGA that enjoys fast linear convergence rates.
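For reference, the sequential SAGA update that ASAGA parallelizes maintains a table of past per-example gradients to reduce variance; a minimal sketch (the asynchronous, lock-free execution that the paper analyzes is not shown):

```python
import numpy as np

def saga(grad_i, n_samples, dim, step, iters=1000, seed=0):
    """Sequential SAGA: variance-reduced SGD with a memory of past gradients.

    grad_i(x, i) returns the gradient of the i-th example's loss at x.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(dim)
    memory = np.array([grad_i(x, i) for i in range(n_samples)])
    avg = memory.mean(axis=0)
    for _ in range(iters):
        i = rng.integers(n_samples)
        g = grad_i(x, i)
        x = x - step * (g - memory[i] + avg)   # unbiased, variance-reduced direction
        avg += (g - memory[i]) / n_samples     # maintain the running average
        memory[i] = g
    return x
```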
1 code implementation • 7 Feb 2016 • Fabian Pedregosa
Most models in machine learning contain at least one hyperparameter to control for model complexity.
1 code implementation • 12 Dec 2014 • Alexandre Abraham, Fabian Pedregosa, Michael Eickenberg, Philippe Gervais, Andreas Muller, Jean Kossaifi, Alexandre Gramfort, Bertrand Thirion, Gaël Varoquaux
Statistical machine learning methods are increasingly used for neuroimaging data analysis.
no code implementations • 11 Aug 2014 • Fabian Pedregosa, Francis Bach, Alexandre Gramfort
We will see that, for a family of surrogate loss functions that subsumes support vector ordinal regression and ORBoosting, consistency can be fully characterized by the derivative of a real-valued function at zero, as happens for convex margin-based surrogates in binary classification.
no code implementations • 27 Feb 2014 • Fabian Pedregosa, Michael Eickenberg, Philippe Ciuciu, Bertrand Thirion, Alexandre Gramfort
We develop a method for the joint estimation of activation and HRF using a rank constraint causing the estimated HRF to be equal across events/conditions, yet permitting it to be different across voxels.
4 code implementations • 1 Sep 2013 • Lars Buitinck, Gilles Louppe, Mathieu Blondel, Fabian Pedregosa, Andreas Mueller, Olivier Grisel, Vlad Niculae, Peter Prettenhofer, Alexandre Gramfort, Jaques Grobler, Robert Layton, Jake Vanderplas, Arnaud Joly, Brian Holt, Gaël Varoquaux
Scikit-learn is an increasingly popular machine learning library.
no code implementations • 10 Aug 2013 • Michael Eickenberg, Fabian Pedregosa, Senoussi Mehdi, Alexandre Gramfort, Bertrand Thirion
Second layer scattering descriptors are known to provide good classification performance on natural quasi-stationary processes such as visual textures due to their sensitivity to higher order moments and continuity with respect to small deformations.
no code implementations • 13 May 2013 • Fabian Pedregosa, Michael Eickenberg, Bertrand Thirion, Alexandre Gramfort
Extracting activation patterns from functional Magnetic Resonance Images (fMRI) datasets remains challenging in rapid-event designs due to the inherent delay of blood oxygen level-dependent (BOLD) signal.
3 code implementations • 2 Jan 2012 • Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Andreas Müller, Joel Nothman, Gilles Louppe, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Édouard Duchesnay
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.
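The library's central abstraction is the estimator interface (fit/predict/transform), which composes into pipelines; a minimal usage example:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A pipeline chains transformers and a final estimator behind one fit/predict API.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```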