Search Results for author: Giulio Biroli

Found 34 papers, 12 papers with code

Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime

no code implementations • ICML 2020 • Stéphane d'Ascoli, Maria Refinetti, Giulio Biroli, Florent Krzakala

We demonstrate that the latter two contributions are the crux of the double descent: they lead to the overfitting peak at the interpolation threshold and to the decay of the test error upon overparametrization.

Cascade of phase transitions in the training of Energy-based models

no code implementations • 23 May 2024 • Dimitrios Bachtis, Giulio Biroli, Aurélien Decelle, Beatriz Seoane

We first describe this process in a controlled setup that allows us to study the training dynamics analytically.

From Zero to Hero: How local curvature at artless initial conditions leads away from bad minima

no code implementations • 4 Mar 2024 • Tony Bonnaire, Giulio Biroli, Chiara Cammarota

Through both theoretical analysis and numerical experiments, we show that in practical cases, i.e. for finite but even very large $N$, successful optimization via gradient descent in phase retrieval is achieved by falling towards the good minima before reaching the bad ones.

Retrieval

Dynamical Regimes of Diffusion Models

no code implementations • 28 Feb 2024 • Giulio Biroli, Tony Bonnaire, Valentin De Bortoli, Marc Mézard

Using statistical physics methods, we study generative diffusion models in the regime where the dimension of space and the number of data are large, and the score function has been trained optimally.

On the Impact of Overparameterization on the Training of a Shallow Neural Network in High Dimensions

no code implementations • 7 Nov 2023 • Simon Martin, Francis Bach, Giulio Biroli

We study the training dynamics of a shallow neural network with quadratic activation functions and quadratic cost in a teacher-student setup.

Interactions and migration rescuing ecological diversity

no code implementations • 18 Sep 2023 • Giulia Garcia Lorenzana, Ada Altieri, Giulio Biroli

Here we show that the case of many species with heterogeneous interactions is different and richer.

Wavelet Conditional Renormalization Group

no code implementations • 11 Jul 2022 • Tanguy Marchand, Misaki Ozawa, Giulio Biroli, Stéphane Mallat

We develop a multiscale approach to estimate high-dimensional probability distributions from a dataset of physical fields or configurations observed in experiments or simulations.

Optimal learning rate schedules in high-dimensional non-convex optimization problems

no code implementations • 9 Feb 2022 • Stéphane d'Ascoli, Maria Refinetti, Giulio Biroli

In this case, it is optimal to keep a large learning rate during the exploration phase to escape the non-convex region as quickly as possible, then use the convex criterion $\beta=1$ to converge rapidly to the solution.

Artificial selection of communities drives the emergence of structured interactions

no code implementations • 13 Dec 2021 • Jules Fraboul, Giulio Biroli, Silvia De Monte

Our analytical and numerical results reveal that selection for scalar community functions leads to the emergence, along an evolutionary trajectory, of a low-dimensional structure in an initially featureless interaction matrix.

Transformed CNNs: recasting pre-trained convolutional layers with self-attention

no code implementations • 10 Jun 2021 • Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Ari Morcos

Finally, we experiment with initializing the T-CNN from a partially trained CNN, and find that it reaches better performance than the corresponding hybrid model trained from scratch, while reducing training time.

Sifting out the features by pruning: Are convolutional networks the winning lottery ticket of fully connected ones?

1 code implementation • 27 Apr 2021 • Franco Pellegrini, Giulio Biroli

Our results show that the winning lottery tickets of FCNs display the key features of CNNs.

Inductive Bias

ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases

9 code implementations • 19 Mar 2021 • Stéphane d'Ascoli, Hugo Touvron, Matthew Leavitt, Ari Morcos, Giulio Biroli, Levent Sagun

We initialise the GPSA layers to mimic the locality of convolutional layers, then give each attention head the freedom to escape locality by adjusting a gating parameter regulating the attention paid to position versus content information.

Ranked #483 on Image Classification on ImageNet

Image Classification • Inductive Bias

On the interplay between data structure and loss function in classification problems

1 code implementation • NeurIPS 2021 • Stéphane d'Ascoli, Marylou Gabrié, Levent Sagun, Giulio Biroli

One of the central puzzles in modern machine learning is the ability of heavily overparametrized models to generalize well.

Elastoplasticity Mediates Dynamical Heterogeneity Below the Mode-Coupling Temperature

1 code implementation • 2 Mar 2021 • Rahul N. Chacko, François P. Landes, Giulio Biroli, Olivier Dauchot, Andrea J. Liu, David R. Reichman

As liquids approach the glass transition temperature, dynamical heterogeneity emerges as a crucial universal feature of their behavior.

Soft Condensed Matter • Statistical Mechanics • Chemical Physics

Rare events and disorder control the brittle yielding of well-annealed amorphous solids

no code implementations • 11 Feb 2021 • Misaki Ozawa, Ludovic Berthier, Giulio Biroli, Gilles Tarjus

We use atomistic computer simulations to provide a microscopic description of the brittle failure of amorphous materials, and we assess the role of rare events and quenched disorder.

Soft Condensed Matter • Disordered Systems and Neural Networks • Materials Science

Amorphous Order & Non-linear Susceptibilities in Glassy Materials

no code implementations • 11 Jan 2021 • Giulio Biroli, Jean-Philippe Bouchaud, François Ladieu

We review 15 years of theoretical and experimental work on the non-linear response of glassy systems.

Disordered Systems and Neural Networks • Soft Condensed Matter • Statistical Mechanics

The Lévy-Rosenzweig-Porter random matrix ensemble

no code implementations • 23 Dec 2020 • Giulio Biroli, Marco Tarzia

The idea is that the energy spreading of the mini-bands can be determined self-consistently by requiring that the maximum of the matrix elements between a site $i$ and the other $N^{D_1}$ sites of the support set is of the same order of the Thouless energy itself $N^{D_1 - 1}$.

Disordered Systems and Neural Networks • Quantum Gases • Statistical Mechanics

An analytic theory of shallow networks dynamics for hinge loss classification

1 code implementation • NeurIPS 2020 • Franco Pellegrini, Giulio Biroli

Neural networks have been shown to perform incredibly well in classification tasks over structured high-dimensional datasets.

General Classification

Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval

no code implementations • NeurIPS 2020 • Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

Despite the widespread use of gradient-based algorithms for optimizing high-dimensional non-convex functions, understanding their ability of finding good minima instead of being trapped in spurious ones remains to a large extent an open problem.

Retrieval

Triple descent and the two kinds of overfitting: Where & why do they appear?

1 code implementation • NeurIPS 2020 • Stéphane d'Ascoli, Levent Sagun, Giulio Biroli

We show that this peak is implicitly regularized by the nonlinearity, which is why it only becomes salient at high noise and is weakly affected by explicit regularization.

regression

Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime

2 code implementations • 2 Mar 2020 • Stéphane d'Ascoli, Maria Refinetti, Giulio Biroli, Florent Krzakala

We obtain a precise asymptotic expression for the bias-variance decomposition of the test error, and show that the bias displays a phase transition at the interpolation threshold, beyond which it remains constant.

Landscape Complexity for the Empirical Risk of Generalized Linear Models

no code implementations • 4 Dec 2019 • Antoine Maillard, Gérard Ben Arous, Giulio Biroli

Under a technical hypothesis, we obtain a rigorous explicit variational formula for the annealed complexity, which is the logarithm of the average number of critical points at fixed value of the empirical risk.

Who is Afraid of Big Bad Minima? Analysis of gradient-flow in spiked matrix-tensor models

1 code implementation • NeurIPS 2019 • Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Lenka Zdeborová

Gradient-based algorithms are effective for many machine learning tasks, but despite ample recent effort and some progress, it often remains unclear why they work in practice in optimising high-dimensional non-convex functions and why they find good minima instead of being trapped in spurious ones. Here we present a quantitative theory explaining this behaviour in a spiked matrix-tensor model. Our framework is based on the Kac-Rice analysis of stationary points and a closed-form analysis of gradient-flow originating from statistical physics.

Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model

no code implementations • 18 Jul 2019 • Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Lenka Zdeborová

Finding the Needle in the Haystack with Convolutions: on the benefits of architectural bias

1 code implementation • NeurIPS 2019 • Stéphane d'Ascoli, Levent Sagun, Joan Bruna, Giulio Biroli

The aim of this work is to understand this fact through the lens of dynamics in the loss landscape.

Attractive versus truncated repulsive supercooled liquids: The dynamics is encoded in the pair correlation function

no code implementations • 3 Jun 2019 • François P. Landes, Giulio Biroli, Olivier Dauchot, Andrea J. Liu, David R. Reichman

We compare glassy dynamics in two liquids that differ in the form of their interaction potentials.

How to iron out rough landscapes and get optimal performances: Averaged Gradient Descent and its application to tensor PCA

no code implementations • 29 May 2019 • Giulio Biroli, Chiara Cammarota, Federico Ricci-Tersenghi

In many high-dimensional estimation problems the main task consists in minimizing a cost function, which is often strongly non-convex when scanned in the space of parameters to be estimated.

Scaling description of generalization with number of parameters in deep learning

1 code implementation • 6 Jan 2019 • Mario Geiger, Arthur Jacot, Stefano Spigler, Franck Gabriel, Levent Sagun, Stéphane d'Ascoli, Giulio Biroli, Clément Hongler, Matthieu Wyart

At this threshold, we argue that $\|f_{N}\|$ diverges.

Marvels and Pitfalls of the Langevin Algorithm in Noisy High-dimensional Inference

no code implementations • 21 Dec 2018 • Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

Gradient-descent-based algorithms and their stochastic versions have widespread applications in machine learning and statistical inference.

A jamming transition from under- to over-parametrization affects loss landscape and generalization

no code implementations • 22 Oct 2018 • Stefano Spigler, Mario Geiger, Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Matthieu Wyart

We argue that in fully-connected networks a phase transition delimits the over- and under-parametrized regimes where fitting can or cannot be achieved.

The jamming transition as a paradigm to understand the loss landscape of deep neural networks

2 code implementations • 25 Sep 2018 • Mario Geiger, Stefano Spigler, Stéphane d'Ascoli, Levent Sagun, Marco Baity-Jesi, Giulio Biroli, Matthieu Wyart

In the vicinity of this transition, properties of the curvature of the minima of the loss are critical.

Comparing Dynamics: Deep Neural Networks versus Glassy Systems

no code implementations • ICML 2018 • Marco Baity-Jesi, Levent Sagun, Mario Geiger, Stefano Spigler, Gérard Ben Arous, Chiara Cammarota, Yann LeCun, Matthieu Wyart, Giulio Biroli

We analyze numerically the training dynamics of deep neural networks (DNN) by using methods developed in statistical physics of glassy systems.

Complex energy landscapes in spiked-tensor and simple glassy models: ruggedness, arrangements of local minima and phase transitions

no code implementations • 8 Apr 2018 • Valentina Ros, Gérard Ben Arous, Giulio Biroli, Chiara Cammarota

We study rough high-dimensional landscapes in which an increasingly stronger preference for a given configuration emerges.

Finite size effects in the dynamics of glass-forming liquids

1 code implementation • 15 Mar 2012 • Ludovic Berthier, Giulio Biroli, Daniele Coslovich, Walter Kob, Cristina Toninelli

We present a comprehensive theoretical study of finite size effects in the relaxation dynamics of glass-forming liquids.

Statistical Mechanics
