no code implementations • ICML 2020 • Damien Scieur, Fabian Pedregosa
We consider the average-case runtime analysis of algorithms for minimizing quadratic objectives.
no code implementations • ICML 2020 • Fabian Pedregosa, Damien Scieur
We develop a framework for designing optimal optimization methods in terms of their average-case runtime.
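The setup behind this line of work admits a compact statement; the following is a sketch in my own notation of the standard residual-polynomial formulation, not text from the papers. For a quadratic $f(x) = \tfrac{1}{2}(x - x^\star)^\top H (x - x^\star)$ with $H$ random, any first-order method run for $t$ steps satisfies $x_t - x^\star = P_t(H)(x_0 - x^\star)$ for some polynomial $P_t$ with $P_t(0) = 1$, so that (under standard isotropy assumptions on $x_0 - x^\star$)

$$\mathbb{E}\,\|x_t - x^\star\|^2 = \int P_t(\lambda)^2 \, d\mu(\lambda),$$

where $\mu$ is the expected spectral distribution of $H$. Worst-case analysis instead controls $\max_{\lambda \in [\ell, L]} |P_t(\lambda)|$; the average-case optimal method is the one whose residual polynomial minimizes the integral above.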
1 code implementation • 21 Feb 2024 • Sanjeev Raja, Ishan Amin, Fabian Pedregosa, Aditi S. Krishnapriyan
As a general framework applicable across NNIP architectures and systems, StABlE Training is a powerful tool for training stable and accurate NNIPs, particularly in the absence of large reference datasets.
no code implementations • 30 Nov 2023 • Vincent Roulet, Atish Agarwala, Fabian Pedregosa
Recent empirical work has revealed an intriguing property of deep learning models by which the sharpness (largest eigenvalue of the Hessian) increases throughout optimization until it stabilizes around a critical value at which the optimizer operates at the edge of stability, given a fixed stepsize (Cohen et al., 2022).
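On a quadratic, that critical value is exactly $2/\text{stepsize}$, which a few lines of NumPy make concrete (a toy illustration of the threshold under my own setup, not an experiment from the paper):

```python
import numpy as np

def gd_on_quadratic(sharpness, stepsize, steps=500):
    """Gradient descent on f(x) = 0.5 * sharpness * x^2, starting from x = 1."""
    x = 1.0
    for _ in range(steps):
        x -= stepsize * sharpness * x  # gradient of f is sharpness * x
    return abs(x)

eta = 0.1
for lam in [15.0, 19.9, 20.1]:  # 2 / eta = 20 is the stability threshold
    print(f"sharpness={lam:5.1f}  |x_final|={gd_on_quadratic(lam, eta):.2e}")
# Below 2/eta the iterates contract; above it they blow up, which is why the
# sharpness stabilizes near 2/eta when the optimizer sits at the edge of stability.
```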
no code implementations • 28 Dec 2022 • Paul Vicol, Jonathan Lorraine, Fabian Pedregosa, David Duvenaud, Roger Grosse
Bilevel problems consist of two nested sub-problems, called the outer and inner problems, respectively.
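The nesting has a standard form; a sketch in my notation, with hyperparameter optimization as the running example (outer: validation loss, inner: regularized training loss):

$$\min_{\lambda} \; F\big(\lambda, w^\star(\lambda)\big) \quad \text{(outer)} \qquad \text{s.t.} \qquad w^\star(\lambda) \in \operatorname*{arg\,min}_{w} \; G(\lambda, w) \quad \text{(inner)}.$$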
no code implementations • 8 Dec 2022 • Charline Le Lan, Joshua Greaves, Jesse Farebrother, Mark Rowland, Fabian Pedregosa, Rishabh Agarwal, Marc G. Bellemare
In this paper, we derive an algorithm that learns a principal subspace from sample entries, can be applied when the approximate subspace is represented by a neural network, and hence can be scaled to datasets with an effectively infinite number of rows and columns.
no code implementations • 9 Nov 2022 • Junhyung Lyle Kim, Gauthier Gidel, Anastasios Kyrillidis, Fabian Pedregosa
The extragradient method has gained popularity due to its robust convergence properties for differentiable games.
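For intuition, on the bilinear toy game $\min_x \max_y xy$ the extragradient extrapolation step is what buys convergence where plain gradient descent-ascent spirals outward; a minimal NumPy sketch (my own toy example, not taken from the paper):

```python
import numpy as np

def grad_field(x, y):
    # Vector field of min_x max_y f(x, y) = x * y: descend in x, ascend in y.
    return np.array([y, -x])

def extragradient(steps=100, eta=0.2):
    z = np.array([1.0, 1.0])  # (x, y)
    for _ in range(steps):
        z_half = z - eta * grad_field(*z)   # extrapolation (lookahead) step
        z = z - eta * grad_field(*z_half)   # update using the lookahead gradient
    return np.linalg.norm(z)

def gda(steps=100, eta=0.2):
    z = np.array([1.0, 1.0])
    for _ in range(steps):
        z = z - eta * grad_field(*z)        # plain simultaneous descent-ascent
    return np.linalg.norm(z)

print("extragradient distance to equilibrium:", extragradient())  # contracts
print("descent-ascent distance to equilibrium:", gda())           # expands
```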
no code implementations • 10 Oct 2022 • Atish Agarwala, Fabian Pedregosa, Jeffrey Pennington
Recent studies of gradient descent with large step sizes have shown that there is often a regime with an initial increase in the largest eigenvalue of the loss Hessian (progressive sharpening), followed by a stabilization of the eigenvalue near the maximum value which allows convergence (edge of stability).
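The quantity being tracked, the largest Hessian eigenvalue, is usually estimated without forming the Hessian, via power iteration on Hessian-vector products; a generic sketch using finite-difference HVPs (illustrative, not the paper's measurement code):

```python
import numpy as np

def sharpness(grad_fn, params, iters=50, eps=1e-4, seed=0):
    """Estimate the largest Hessian eigenvalue by power iteration.

    Hessian-vector products are approximated by a finite difference of gradients:
    H v ~= (grad(w + eps * v) - grad(w)) / eps.
    """
    rng = np.random.default_rng(seed)
    v = rng.normal(size=params.shape)
    v /= np.linalg.norm(v)
    g0 = grad_fn(params)
    lam = 0.0
    for _ in range(iters):
        hv = (grad_fn(params + eps * v) - g0) / eps
        lam = float(v @ hv)                      # Rayleigh quotient estimate
        v = hv / (np.linalg.norm(hv) + 1e-12)
    return lam

# Sanity check on a quadratic with known eigenvalues 1..5.
H = np.diag([1.0, 2.0, 3.0, 4.0, 5.0])
print(sharpness(lambda w: H @ w, np.ones(5)))    # approximately 5.0
```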
no code implementations • 27 Sep 2022 • Damien Scieur, Quentin Bertrand, Gauthier Gidel, Fabian Pedregosa
Computing the Jacobian of the solution of an optimization problem is a central problem in machine learning, with applications in hyperparameter optimization, meta-learning, optimization as a layer, and dataset distillation, to name a few.
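That Jacobian has a closed form via the implicit function theorem; a sketch of the standard identity in my notation. If $x^\star(\theta)$ is characterized by the optimality condition $\nabla_x f(x^\star(\theta), \theta) = 0$, differentiating through that condition gives

$$\partial_\theta x^\star(\theta) = -\big[\nabla^2_{xx} f(x^\star, \theta)\big]^{-1} \, \nabla^2_{x\theta} f(x^\star, \theta),$$

so computing the Jacobian reduces to one linear solve against the Hessian of the inner problem rather than differentiating through the inner solver's iterations.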
no code implementations • 20 Jun 2022 • Leonardo Cunha, Gauthier Gidel, Fabian Pedregosa, Damien Scieur, Courtney Paquette
The recently developed average-case analysis of optimization methods allows a more fine-grained and representative convergence analysis than usual worst-case results.
no code implementations • 24 Feb 2022 • Robert M. Gower, Mathieu Blondel, Nidham Gazagnadou, Fabian Pedregosa
We use this insight to develop new variants of the SPS method that are better suited to nonlinear models.
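For context, the basic stochastic Polyak step-size (SPS) that these variants build on divides the per-example suboptimality by the squared gradient norm; a minimal sketch with the usual cap, assuming the per-example optimal values $f_i^*$ are known (often taken to be zero for interpolating models):

```python
import numpy as np

def sps_step(x, grad_i, loss_i, loss_i_star=0.0, c=0.5, gamma_max=1.0):
    """One SGD step with the stochastic Polyak step-size (basic SPS form).

    gamma = (f_i(x) - f_i^*) / (c * ||grad f_i(x)||^2), capped at gamma_max.
    """
    g = grad_i(x)
    gamma = (loss_i(x) - loss_i_star) / (c * np.dot(g, g) + 1e-12)
    return x - min(gamma, gamma_max) * g
```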
1 code implementation • ICLR 2022 • Utku Evci, Bart van Merriënboer, Thomas Unterthiner, Max Vladymyrov, Fabian Pedregosa
The architecture and the parameters of neural networks are often optimized independently, which requires costly retraining of the parameters whenever the architecture is modified.
1 code implementation • NeurIPS 2021 • Mathieu Blondel, Quentin Berthet, Marco Cuturi, Roy Frostig, Stephan Hoyer, Felipe Llinares-López, Fabian Pedregosa, Jean-Philippe Vert
In this paper, we propose automatic implicit differentiation, an efficient and modular approach for implicit differentiation of optimization problems.
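A small numerical check of the idea on ridge regression, where differentiating the optimality condition gives the derivative of the solution with respect to the regularization strength (my own toy example; it does not use or depict the library's API):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = rng.normal(size=50)
lam = 0.3

def ridge_solution(lam):
    # w*(lam) solves (X^T X + lam I) w = X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)

# Implicit differentiation: the optimality condition (X^T X + lam I) w - X^T y = 0
# gives dw/dlam = -(X^T X + lam I)^{-1} w*.
w_star = ridge_solution(lam)
dw_implicit = -np.linalg.solve(X.T @ X + lam * np.eye(5), w_star)

# Finite-difference check.
eps = 1e-6
dw_fd = (ridge_solution(lam + eps) - ridge_solution(lam - eps)) / (2 * eps)
print(np.allclose(dw_implicit, dw_fd, atol=1e-6))  # True
```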
no code implementations • 19 May 2021 • Gideon Dresdner, Saurav Shekhar, Fabian Pedregosa, Francesco Locatello, Gunnar Rätsch
Variational Inference makes a trade-off between the capacity of the variational family and the tractability of finding an approximate posterior distribution.
1 code implementation • 17 Feb 2021 • Fartash Faghri, Sven Gowal, Cristina Vasconcelos, David J. Fleet, Fabian Pedregosa, Nicolas Le Roux
We demonstrate that the choice of optimizer, neural network architecture, and regularizer significantly affect the adversarial robustness of linear neural networks, providing guarantees without the need for adversarial training.
no code implementations • 8 Feb 2021 • Courtney Paquette, Kiwon Lee, Fabian Pedregosa, Elliot Paquette
We propose a new framework, inspired by random matrix theory, for analyzing the dynamics of stochastic gradient descent (SGD) when both number of samples and dimensions are large.
no code implementations • ICLR 2021 • Carles Domingo-Enrich, Fabian Pedregosa, Damien Scieur
First, we show that for zero-sum bilinear games the average-case optimal method is the optimal method for the minimization of the Hamiltonian.
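The Hamiltonian of a zero-sum bilinear game is the squared norm of the game's vector field; a sketch in standard notation:

$$\min_x \max_y \; x^\top A y, \qquad v(x, y) = \begin{pmatrix} A y \\ -A^\top x \end{pmatrix}, \qquad \mathcal{H}(x, y) = \tfrac{1}{2}\|v(x, y)\|^2 = \tfrac{1}{2}\big(\|A y\|^2 + \|A^\top x\|^2\big),$$

whose minimizers (where $\mathcal{H} = 0$) are exactly the equilibria of the game, so minimizing $\mathcal{H}$ turns the game into an ordinary minimization problem.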
no code implementations • 8 Jun 2020 • Courtney Paquette, Bart van Merriënboer, Elliot Paquette, Fabian Pedregosa
In fact, the halting time exhibits a universality property: it is independent of the probability distribution.
1 code implementation • ICML 2020 • Geoffrey Négiar, Gideon Dresdner, Alicia Tsai, Laurent El Ghaoui, Francesco Locatello, Robert M. Freund, Fabian Pedregosa
We propose a novel Stochastic Frank-Wolfe (a.k.a. conditional gradient) algorithm.
no code implementations • ICLR 2020 • Lukas Balles, Fabian Pedregosa, Nicolas Le Roux
Sign-based optimization methods have become popular in machine learning due to their favorable communication cost in distributed optimization and their surprisingly good performance in neural network training.
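The basic member of this family keeps only the coordinate-wise sign of the gradient, which is what makes the communication cost one bit per coordinate; a minimal sketch (illustrative, not the specific variant analyzed in the paper):

```python
import numpy as np

def sign_sgd_step(params, grad, lr=1e-3):
    """Sign-based update: only the sign of each gradient coordinate is used,
    so a distributed worker needs to communicate just one bit per coordinate."""
    return params - lr * np.sign(grad)
```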
no code implementations • 12 Feb 2020 • Fabian Pedregosa, Damien Scieur
We develop a framework for the average-case analysis of random quadratic problems and derive algorithms that are optimal under this analysis.
1 code implementation • 8 Oct 2019 • Elena Kalinina, Fabian Pedregosa, Vittorio Iacovella, Emanuele Olivetti, Paolo Avesani
In the last decade, the identification of shared activity patterns has been mostly framed as a supervised learning problem.
no code implementations • ICML Workshop Deep Phenomena 2019 • Utku Evci, Fabian Pedregosa, Aidan Gomez, Erich Elsen
Additionally, our attempts to find a decreasing objective path from "bad" solutions to the "good" ones in the sparse subspace fail.
no code implementations • 18 Jun 2019 • Valentin Thomas, Fabian Pedregosa, Bart van Merriënboer, Pierre-Antoine Manzagol, Yoshua Bengio, Nicolas Le Roux
The speed at which one can minimize an expected loss using stochastic methods depends on two properties: the curvature of the loss and the variance of the gradients.
1 code implementation • 19 Jun 2018 • Fabian Pedregosa, Kilian Fatras, Mattia Casotto
This is due to the fact that existing methods require to evaluate the proximity operator for the nonsmooth terms, which can be a costly operation for complex penalties.
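The proximity operator mentioned here is the basic primitive of proximal methods; for a simple penalty like $\ell_1$ it has a closed form (soft-thresholding), while for complex penalties it may itself require an inner iterative solver. A generic sketch of the $\ell_1$ case and one proximal-gradient step (a standard illustration, not this paper's algorithm):

```python
import numpy as np

def prox_l1(v, threshold):
    """Proximity operator of threshold * ||.||_1: coordinate-wise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - threshold, 0.0)

def proximal_gradient_step(x, grad_smooth, step, reg):
    """One step of proximal gradient descent on smooth_loss(x) + reg * ||x||_1."""
    return prox_l1(x - step * grad_smooth(x), step * reg)
```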
no code implementations • 9 Apr 2018 • Gauthier Gidel, Fabian Pedregosa, Simon Lacoste-Julien
In this work, we develop and analyze the Frank-Wolfe Augmented Lagrangian (FW-AL) algorithm, a method for minimizing a smooth function over convex compact sets related by a "linear consistency" constraint that only requires access to a linear minimization oracle over the individual constraints.
no code implementations • ICML 2018 • Fabian Pedregosa, Gauthier Gidel
We propose and analyze an adaptive step-size variant of the Davis-Yin three operator splitting.
no code implementations • ICML 2018 • Thomas Kerdreux, Fabian Pedregosa, Alexandre d'Aspremont
The first algorithm that we propose is a randomized variant of the original FW algorithm and achieves a $\mathcal{O}(1/t)$ sublinear convergence rate as in the deterministic counterpart.
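For reference, the deterministic counterpart is the classical Frank-Wolfe iteration, which touches the constraint set only through a linear minimization oracle; a sketch over the probability simplex (generic FW, not the randomized variant proposed in the paper):

```python
import numpy as np

def frank_wolfe_simplex(grad_fn, dim, iters=200):
    """Classical Frank-Wolfe over the probability simplex.

    The linear minimization oracle over the simplex is simply the vertex
    (coordinate) with the most negative gradient entry.
    """
    x = np.full(dim, 1.0 / dim)
    for t in range(iters):
        g = grad_fn(x)
        s = np.zeros(dim)
        s[np.argmin(g)] = 1.0          # LMO: best vertex of the simplex
        gamma = 2.0 / (t + 2.0)        # standard open-loop step size
        x = (1 - gamma) * x + gamma * s
    return x

# Example: project a point onto the simplex by minimizing ||x - y||^2 / 2.
y = np.array([0.1, 0.5, 0.2])
print(frank_wolfe_simplex(lambda x: x - y, dim=3))
```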
no code implementations • 11 Jan 2018 • Rémi Leblond, Fabian Pedregosa, Simon Lacoste-Julien
Notably, we prove that ASAGA and KROMAGNON can obtain a theoretical linear speedup on multi-core systems even without sparsity assumptions.
1 code implementation • NeurIPS 2017 • Fabian Pedregosa, Rémi Leblond, Simon Lacoste-Julien
Due to their simplicity and excellent performance, parallel asynchronous variants of stochastic gradient descent have become popular methods to solve a wide range of large-scale optimization problems on multi-core architectures.
no code implementations • 25 Oct 2016 • Fabian Pedregosa
The three operator splitting scheme was recently proposed by [Davis and Yin, 2015] as a method to optimize composite objective functions with one convex smooth term and two convex (possibly non-smooth) terms for which we have access to their proximity operator.
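The scheme itself is a compact iteration; a sketch, assuming the two proximity operators and the gradient of the smooth term are available as plain functions (a fixed step size is used here for simplicity):

```python
import numpy as np

def davis_yin(prox_f, prox_g, grad_h, z0, step, iters=500):
    """Three operator splitting for min_x f(x) + g(x) + h(x), with h smooth
    and f, g accessed only through their proximity operators (Davis and Yin, 2015)."""
    z = np.array(z0, dtype=float)
    for _ in range(iters):
        x_g = prox_g(z, step)
        x_f = prox_f(2 * x_g - z - step * grad_h(x_g), step)
        z = z + x_f - x_g
    return x_g  # x_g converges to a solution
```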
1 code implementation • 15 Jun 2016 • Rémi Leblond, Fabian Pedregosa, Simon Lacoste-Julien
We describe ASAGA, an asynchronous parallel version of the incremental gradient algorithm SAGA that enjoys fast linear convergence rates.
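For reference, the sequential SAGA update that ASAGA parallelizes maintains a table of past per-example gradients to reduce variance; a minimal sketch (the asynchronous, lock-free execution that the paper analyzes is not shown):

```python
import numpy as np

def saga(grad_i, n_samples, dim, step, iters=1000, seed=0):
    """Sequential SAGA: variance-reduced SGD with a memory of past gradients.

    grad_i(x, i) returns the gradient of the i-th example's loss at x.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(dim)
    memory = np.array([grad_i(x, i) for i in range(n_samples)])
    avg = memory.mean(axis=0)
    for _ in range(iters):
        i = rng.integers(n_samples)
        g = grad_i(x, i)
        x = x - step * (g - memory[i] + avg)   # unbiased, variance-reduced direction
        avg += (g - memory[i]) / n_samples     # maintain the running average
        memory[i] = g
    return x
```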
1 code implementation • 7 Feb 2016 • Fabian Pedregosa
Most models in machine learning contain at least one hyperparameter to control for model complexity.
1 code implementation • 12 Dec 2014 • Alexandre Abraham, Fabian Pedregosa, Michael Eickenberg, Philippe Gervais, Andreas Muller, Jean Kossaifi, Alexandre Gramfort, Bertrand Thirion, Gaël Varoquaux
Statistical machine learning methods are increasingly used for neuroimaging data analysis.
no code implementations • 11 Aug 2014 • Fabian Pedregosa, Francis Bach, Alexandre Gramfort
We will see that, for a family of surrogate loss functions that subsumes support vector ordinal regression and ORBoosting, consistency can be fully characterized by the derivative of a real-valued function at zero, as happens for convex margin-based surrogates in binary classification.
no code implementations • 27 Feb 2014 • Fabian Pedregosa, Michael Eickenberg, Philippe Ciuciu, Bertrand Thirion, Alexandre Gramfort
We develop a method for the joint estimation of activation and HRF using a rank constraint causing the estimated HRF to be equal across events/conditions, yet permitting it to be different across voxels.
4 code implementations • 1 Sep 2013 • Lars Buitinck, Gilles Louppe, Mathieu Blondel, Fabian Pedregosa, Andreas Mueller, Olivier Grisel, Vlad Niculae, Peter Prettenhofer, Alexandre Gramfort, Jaques Grobler, Robert Layton, Jake Vanderplas, Arnaud Joly, Brian Holt, Gaël Varoquaux
Scikit-learn is an increasingly popular machine learning library.
no code implementations • 10 Aug 2013 • Michael Eickenberg, Fabian Pedregosa, Senoussi Mehdi, Alexandre Gramfort, Bertrand Thirion
Second layer scattering descriptors are known to provide good classification performance on natural quasi-stationary processes such as visual textures due to their sensitivity to higher order moments and continuity with respect to small deformations.
no code implementations • 13 May 2013 • Fabian Pedregosa, Michael Eickenberg, Bertrand Thirion, Alexandre Gramfort
Extracting activation patterns from functional Magnetic Resonance Images (fMRI) datasets remains challenging in rapid-event designs due to the inherent delay of blood oxygen level-dependent (BOLD) signal.
3 code implementations • 2 Jan 2012 • Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Andreas Müller, Joel Nothman, Gilles Louppe, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Édouard Duchesnay
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.
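The library's central abstraction is the estimator interface (fit/predict/transform), which composes into pipelines; a minimal usage example:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A pipeline chains transformers and a final estimator behind one fit/predict API.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```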