Search Results for author: Atsushi Yamamura

Found 2 papers, 1 paper with code

Stochastic Collapse: How Gradient Noise Attracts SGD Dynamics Towards Simpler Subnetworks

1 code implementation • NeurIPS 2023 • Feng Chen, Daniel Kunin, Atsushi Yamamura, Surya Ganguli

In this work, we reveal a strong implicit bias of stochastic gradient descent (SGD) that drives overly expressive networks to much simpler subnetworks, thereby dramatically reducing the number of independent parameters, and improving generalization.
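Below is a minimal, hypothetical sketch (not the authors' code) of how one might probe the effect the abstract describes: train an over-parameterized one-hidden-layer ReLU network with small-batch SGD (large gradient noise) and then count how many hidden units still carry non-negligible weight norm, as a crude proxy for the size of the surviving subnetwork. The toy task, architecture sizes, learning rate, step count, and threshold are all illustrative assumptions.

```python
# Hedged sketch, not the paper's method: measure how many hidden units remain
# "active" after noisy SGD training of an over-parameterized ReLU network.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy regression data: the target depends on only a few input directions.
n, d, hidden = 512, 20, 256
X = torch.randn(n, d)
y = torch.sin(X[:, :3].sum(dim=1, keepdim=True))

model = nn.Sequential(nn.Linear(d, hidden), nn.ReLU(), nn.Linear(hidden, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.MSELoss()

batch = 8  # small batches -> large gradient noise
for step in range(5000):
    idx = torch.randint(0, n, (batch,))
    loss = loss_fn(model(X[idx]), y[idx])
    opt.zero_grad()
    loss.backward()
    opt.step()

# A hidden unit only contributes through its incoming and outgoing weights,
# so count units whose combined weight norm has collapsed toward zero.
w_in = model[0].weight.detach()        # shape (hidden, d)
w_out = model[2].weight.detach().t()   # shape (hidden, 1)
unit_norm = w_in.norm(dim=1) * w_out.norm(dim=1)
active = (unit_norm > 1e-3 * unit_norm.max()).sum().item()
print(f"final batch loss {loss.item():.4f}, active hidden units: {active}/{hidden}")
```

The count of active units is only a rough diagnostic; it simply illustrates the kind of "effective parameter" measurement the abstract alludes to.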

The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks

no code implementations • 7 Oct 2022 • Daniel Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli

We introduce the class of quasi-homogeneous models, which is expressive enough to describe nearly all neural networks with homogeneous activations, even those with biases, residual connections, and normalization layers, while structured enough to enable geometric analysis of its gradient dynamics.
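For orientation, the sketch below contrasts the classical homogeneity assumption used in prior margin analyses with the standard mathematical notion of quasi-homogeneity (weighted homogeneity), which allows a separate scaling exponent per parameter group. This is a hedged illustration of the general idea; the paper's precise definition and exponent conventions may differ.

```latex
% Hedged sketch: standard (weighted) homogeneity notation, not quoted from the paper.
\[
  \text{homogeneous of degree } L:\qquad
  f(\alpha\,\theta;\,x) = \alpha^{L}\, f(\theta;\,x)
  \qquad \forall \alpha > 0,
\]
\[
  \text{quasi-homogeneous:}\qquad
  f\!\left(\alpha^{a_1}\theta_1,\,\dots,\,\alpha^{a_p}\theta_p;\,x\right)
  = \alpha^{d}\, f(\theta;\,x)
  \qquad \forall \alpha > 0,
\]
```

Here each parameter group $\theta_i$ carries its own exponent $a_i \ge 0$, so components such as biases or normalization parameters can scale differently (or not at all, $a_i = 0$) while the model output still rescales predictably, which is why the class can accommodate biases, residual connections, and normalization layers.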
