1 code implementation • 20 Mar 2024 • Charles C. Margossian, Loucas Pillaud-Vivien, Lawrence K. Saul
Our analysis covers the KL divergence, the Rényi divergences, and a score-based divergence that compares $\nabla\log p$ and $\nabla\log q$.
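All three divergences admit closed forms between Gaussians, which makes a one-dimensional numerical check easy. The sketch below uses the Fisher divergence as the score-based divergence; that is an illustrative assumption, not necessarily the paper's exact choice.

```python
import numpy as np

def kl_gauss(m1, s1, m2, s2):
    # Closed-form KL(N(m1, s1^2) || N(m2, s2^2))
    return np.log(s2 / s1) + (s1**2 + (m1 - m2)**2) / (2 * s2**2) - 0.5

def renyi_gauss(alpha, m1, s1, m2, s2):
    # Closed-form Renyi divergence of order alpha between 1-d Gaussians;
    # valid while sa2 = alpha*s2^2 + (1-alpha)*s1^2 stays positive.
    sa2 = alpha * s2**2 + (1 - alpha) * s1**2
    return (alpha * (m1 - m2)**2 / (2 * sa2)
            - np.log(sa2 / (s1**(2 * (1 - alpha)) * s2**(2 * alpha)))
            / (2 * (alpha - 1)))

def fisher_gauss(m1, s1, m2, s2, n=100_000, seed=0):
    # Monte Carlo estimate of E_p[(d/dx log p - d/dx log q)^2]: the Fisher
    # divergence, used here as a stand-in for a score-based divergence.
    x = m1 + s1 * np.random.default_rng(seed).standard_normal(n)
    score_p = -(x - m1) / s1**2
    score_q = -(x - m2) / s2**2
    return np.mean((score_p - score_q)**2)

print(kl_gauss(0.0, 1.0, 1.0, 2.0))          # ~0.443
print(renyi_gauss(0.5, 0.0, 1.0, 1.0, 2.0))  # ~0.323
print(fisher_gauss(0.0, 1.0, 1.0, 2.0))      # exact value is 0.625
```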
no code implementations • 8 Mar 2024 • Alex Damian, Loucas Pillaud-Vivien, Jason D. Lee, Joan Bruna
Single-Index Models are high-dimensional regression problems with planted structure, whereby labels depend on an unknown one-dimensional projection of the input via a generic, non-linear, and potentially non-deterministic transformation.
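For concreteness, here is a minimal sketch of how single-index data can be generated; the ReLU link and the noise level are illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 100, 5_000

# Planted direction: labels depend on x only through <w_star, x>.
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)

X = rng.standard_normal((n, d))   # high-dimensional Gaussian inputs
z = X @ w_star                    # the unknown one-dimensional projection

# Illustrative non-linear link plus label noise, making the
# transformation from x to y non-deterministic.
y = np.maximum(z, 0.0) + 0.1 * rng.standard_normal(n)
```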
no code implementations • 22 Feb 2024 • Diana Cai, Chirag Modi, Loucas Pillaud-Vivien, Charles C. Margossian, Robert M. Gower, David M. Blei, Lawrence K. Saul
We analyze the convergence of BaM when the target distribution is Gaussian, and we prove that, in the limit of infinite batch size, the variational parameter updates converge exponentially quickly to the target mean and covariance.
no code implementations • 30 Oct 2023 • Alberto Bietti, Joan Bruna, Loucas Pillaud-Vivien
We study gradient flow on the multi-index regression problem for high-dimensional Gaussian data.
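A minimal way to emulate gradient flow numerically is full-batch gradient descent with a small step size (an Euler discretization). The sketch below trains a two-layer network on multi-index data, i.e. labels that depend on a $k$-dimensional projection of the input; the link function, widths, and step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, n, m = 20, 2, 1_000, 64

# Multi-index targets: y depends on x only through a k-dimensional projection.
W_star = rng.standard_normal((k, d)) / np.sqrt(d)
X = rng.standard_normal((n, d))
y = np.tanh(X @ W_star.T).sum(axis=1)   # illustrative link, not the paper's

# Two-layer network f(x) = a . tanh(W x), trained by gradient descent
# with a small step size as an Euler discretization of gradient flow.
W = rng.standard_normal((m, d)) / np.sqrt(d)
a = rng.standard_normal(m) / np.sqrt(m)
eta = 0.05
for t in range(1_000):
    H = np.tanh(X @ W.T)                # (n, m) hidden activations
    r = H @ a - y                       # residuals
    grad_a = H.T @ r / n
    grad_W = ((1 - H**2) * np.outer(r, a)).T @ X / n
    a -= eta * grad_a
    W -= eta * grad_W
    if t % 250 == 0:
        print(t, 0.5 * np.mean(r**2))
```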
no code implementations • 28 Jul 2023 • Joan Bruna, Loucas Pillaud-Vivien, Aaron Zweig
Sparse high-dimensional functions have emerged as a rich framework for studying the behavior of gradient-descent methods with shallow neural networks, showcasing their ability to perform feature learning beyond linear models.
1 code implementation • 13 Feb 2023 • Loucas Pillaud-Vivien, Francis Bach
Spectral clustering and diffusion maps are celebrated dimensionality reduction algorithms built on eigen-elements related to the diffusive structure of the data.
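Both algorithms share the same recipe: form a kernel affinity matrix, normalize it into a diffusion (Markov) operator, and embed each point using the operator's leading non-trivial eigenvectors. A minimal sketch, with an illustrative bandwidth and normalization:

```python
import numpy as np

def diffusion_map(X, n_components=2, bandwidth=1.0, t=1):
    # Pairwise Gaussian affinities W_ij = exp(-||x_i - x_j||^2 / bandwidth)
    sq = np.sum((X[:, None, :] - X[None, :, :])**2, axis=-1)
    W = np.exp(-sq / bandwidth)
    # Row-normalize into a Markov transition matrix (diffusion operator)
    P = W / W.sum(axis=1, keepdims=True)
    # Eigen-decomposition; the top eigenvector is constant and is dropped
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    vals, vecs = vals.real[order], vecs.real[:, order]
    # Diffusion map: coordinates scaled by eigenvalues raised to the time t
    return vecs[:, 1:n_components + 1] * vals[1:n_components + 1]**t

X = np.random.default_rng(2).standard_normal((200, 3))
emb = diffusion_map(X)
print(emb.shape)  # (200, 2)
```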
1 code implementation • 11 Oct 2022 • Maksym Andriushchenko, Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion
We present empirical observations that (i) commonly used large step sizes lead the iterates to jump from one side of a valley to the other, causing loss stabilization, and (ii) this stabilization induces a hidden stochastic dynamics, orthogonal to the bouncing directions, that implicitly biases the iterates toward sparse predictors.
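A toy model in which this implicit bias toward sparsity is often studied is the diagonal linear network, where the predictor is parametrized as the elementwise product $u \odot v$. The sketch below runs full-batch gradient descent in that model; the dimensions, initialization, and step size are illustrative, and whether bouncing and loss stabilization occur depends on these choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 40, 100
X = rng.standard_normal((n, d))
beta_star = np.zeros(d); beta_star[:5] = 1.0   # sparse ground truth
y = X @ beta_star

# Diagonal linear network: predictor beta = u * v (elementwise)
u = np.full(d, 0.1); v = np.full(d, 0.1)
eta = 0.05   # increase to probe the large-step-size regime
for t in range(5_000):
    r = X @ (u * v) - y
    g = X.T @ r / n
    u, v = u - eta * g * v, v - eta * g * u    # simultaneous update
    if t % 1_000 == 0:
        print(t, 0.5 * np.mean(r**2))

print("nonzero coordinates recovered:", np.sum(np.abs(u * v) > 1e-2))
```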
no code implementations • 20 Jun 2022 • Loucas Pillaud-Vivien, Julien Reygner, Nicolas Flammarion
Understanding the implicit bias of training algorithms is crucial to explaining the success of overparametrised neural networks.
1 code implementation • 2 Jun 2022 • Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion
The training of neural networks by gradient descent methods is a cornerstone of the deep learning revolution.
no code implementations • NeurIPS 2021 • Scott Pesme, Loucas Pillaud-Vivien, Nicolas Flammarion
Understanding the implicit bias of training algorithms is crucial to explaining the success of overparametrised neural networks.
no code implementations • NeurIPS 2021 • Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion
Motivated by the recent successes of neural networks that fit the data perfectly and generalize well, we study the noiseless model in the fundamental least-squares setup.
2 code implementations • NeurIPS 2021 • Vivien Cabannes, Loucas Pillaud-Vivien, Francis Bach, Alessandro Rudi
As data annotations can be scarce in large-scale practical problems, leveraging unlabelled examples is one of the most important aspects of machine learning.
no code implementations • NeurIPS 2018 • Loucas Pillaud-Vivien, Alessandro Rudi, Francis Bach
We consider stochastic gradient descent (SGD) for least-squares regression with potentially several passes over the data.
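A minimal multi-pass SGD loop for least-squares regression looks as follows; the step size and the number of passes are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 1_000, 20
X = rng.standard_normal((n, d))
theta_star = rng.standard_normal(d)
y = X @ theta_star + 0.1 * rng.standard_normal(n)

theta = np.zeros(d)
eta = 0.01
for epoch in range(5):                     # several passes over the same data
    for i in rng.permutation(n):           # one pass = one shuffled sweep
        g = (X[i] @ theta - y[i]) * X[i]   # single-sample gradient
        theta -= eta * g
    print(epoch, np.linalg.norm(theta - theta_star))
```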
no code implementations • 13 Dec 2017 • Loucas Pillaud-Vivien, Alessandro Rudi, Francis Bach
We consider binary classification problems with positive definite kernels and square loss, and study the convergence rates of stochastic gradient methods.
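In this setting, SGD runs in the reproducing kernel Hilbert space of the kernel: each step shifts the coefficient of a kernel function centered at the current sample, and classification is by the sign of the learned function. A minimal single-pass sketch with a Gaussian kernel (all hyperparameters illustrative):

```python
import numpy as np

def gauss_kernel(a, B, bw=1.0):
    # k(a, b_j) for every row b_j of B
    return np.exp(-np.sum((a - B)**2, axis=-1) / (2 * bw**2))

rng = np.random.default_rng(5)
n, d = 500, 2
X = rng.standard_normal((n, d))
y = np.sign(X[:, 0] * X[:, 1])    # +/-1 labels with a nonlinear boundary

# Kernel SGD with square loss: f <- f - eta * (f(x_i) - y_i) k(x_i, .)
alpha = np.zeros(n)               # coefficients of f on the training samples
eta = 0.5
for i in rng.permutation(n):
    f_xi = alpha @ gauss_kernel(X[i], X)   # current prediction at x_i
    alpha[i] -= eta * (f_xi - y[i])

preds = np.sign([alpha @ gauss_kernel(x, X) for x in X])
print("train accuracy:", np.mean(preds == y))
```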