Search Results for author: Aditya Varre

Found 4 papers, 2 papers with code

Why Do We Need Weight Decay in Modern Deep Learning?

1 code implementation • 6 Oct 2023 • Maksym Andriushchenko, Francesco D'Angelo, Aditya Varre, Nicolas Flammarion

In this work, we highlight that the role of weight decay in modern deep learning is different from its regularization effect studied in classical learning theory.

Learning Theory • Stochastic Optimization
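For reference, here is a minimal sketch of one SGD step with weight decay, the update whose role the paper examines; the names and values (`grad`, `lr`, `wd`) are hypothetical illustrations, not the paper's experimental code:

```python
import numpy as np

# Illustrative sketch, not the paper's code: one SGD step with weight
# decay on a parameter vector `w`; `grad`, `lr`, `wd` are hypothetical.
def sgd_step(w, grad, lr=0.1, wd=5e-4):
    # w <- w - lr * (grad + wd * w): the weight-decay term shrinks the
    # weights on every step, the mechanism whose role the paper studies.
    return w - lr * (grad + wd * w)

w = np.ones(3)
print(sgd_step(w, grad=np.zeros(3)))  # with zero gradient, each entry shrinks by lr*wd
```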

SGD with Large Step Sizes Learns Sparse Features

1 code implementation • 11 Oct 2022 • Maksym Andriushchenko, Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion

We present empirical observations that (i) commonly used large step sizes lead the iterates to jump from one side of a valley to the other, causing loss stabilization, and (ii) this stabilization induces a hidden stochastic dynamics, orthogonal to the bouncing directions, that implicitly biases SGD toward sparse predictors.
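The bouncing in (i) can be reproduced on a toy quadratic; the sketch below is a hypothetical illustration, not the paper's experiments. With a step size between 1/L and 2/L, gradient descent on f(x) = 0.5·L·x² flips sign at every iteration, jumping across the valley while the loss decays slowly:

```python
# Toy illustration (not the paper's setup): gradient descent on a 1-D
# quadratic valley f(x) = 0.5 * L * x**2 with a large step size.
# For 1/L < lr < 2/L the update factor (1 - lr*L) is negative, so the
# iterate bounces from one side of the valley to the other.
L_smooth = 1.0
lr = 1.9 / L_smooth          # large step size, just below the 2/L divergence threshold
x = 1.0
for t in range(8):
    grad = L_smooth * x      # f'(x) = L * x
    x = x - lr * grad        # multiply by (1 - lr*L) = -0.9: sign flips each step
    print(f"step {t}: x = {x:+.4f}, loss = {0.5 * L_smooth * x**2:.4f}")
```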

Accelerated SGD for Non-Strongly-Convex Least Squares

no code implementations • 3 Mar 2022 • Aditya Varre, Nicolas Flammarion

We consider stochastic approximation for the least squares regression problem in the non-strongly convex setting.

Regression
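For context, the plain (non-accelerated) stochastic-approximation recursion for least squares, the baseline this paper accelerates, can be sketched as follows; the sample distribution, noise level, and step size are hypothetical assumptions, and the paper's accelerated recursion is not reproduced here:

```python
import numpy as np

# Baseline sketch of stochastic approximation for least squares (plain
# SGD, not the paper's accelerated method). Streaming samples (x_t, y_t),
# constant step size `gamma`.
rng = np.random.default_rng(0)
d, gamma = 5, 0.05
w_star = rng.normal(size=d)
w = np.zeros(d)
for t in range(10_000):
    x = rng.normal(size=d)
    y = x @ w_star + 0.1 * rng.normal()   # noisy linear observations
    w -= gamma * (x @ w - y) * x          # SGD step on 0.5 * (x @ w - y)**2
print(f"error ||w - w*|| = {np.linalg.norm(w - w_star):.3f}")
```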

Last iterate convergence of SGD for Least-Squares in the Interpolation regime

no code implementations • NeurIPS 2021 • Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion

Motivated by the recent success of neural networks that fit the data perfectly while still generalizing well, we study the noiseless model in the fundamental least-squares setup.

Stochastic Optimization
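A minimal sketch of the noiseless (interpolation) setup the paper studies, where the labels are exactly linear and the last SGD iterate is the output; all names and constants here are illustrative assumptions, and the paper's convergence rates are not reproduced:

```python
import numpy as np

# Sketch of the noiseless least-squares model: y = <w*, x> exactly, so a
# linear predictor can interpolate every sample. Plain last-iterate SGD.
rng = np.random.default_rng(1)
d, gamma = 5, 0.1
w_star = rng.normal(size=d)
w = np.zeros(d)
for t in range(5_000):
    x = rng.normal(size=d)
    y = x @ w_star                        # noiseless: exact interpolation is possible
    w -= gamma * (x @ w - y) * x          # the last iterate w is the returned predictor
print(f"last-iterate error = {np.linalg.norm(w - w_star):.2e}")
```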
