no code implementations • 6 Dec 2023 • Amirhesam Abedsoltan, Parthe Pandit, Luis Rademacher, Mikhail Belkin
Scalable algorithms for learning kernel models need to be iterative, but their convergence can be slow due to poor conditioning.
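A minimal numerical sketch of the conditioning issue, under assumptions of my own (a Gaussian kernel, a fixed-step Richardson/gradient iteration, synthetic 1-D data; this is not the paper's algorithm): the residual component along each eigendirection of the kernel matrix K shrinks by a factor |1 - lr*lambda_i| per step, so when K is ill-conditioned the small-eigenvalue directions converge very slowly.

```python
# Sketch: fixed-step iterative fitting of a kernel model slows down when the
# kernel matrix K is ill-conditioned. Bandwidths and data are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.uniform(-1.0, 1.0, size=(n, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(n)

def gaussian_kernel(A, B, bandwidth):
    d2 = (A[:, None, 0] - B[None, :, 0]) ** 2
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

for bw in (0.1, 2.0):                       # wide bandwidth -> ill-conditioned K
    K = gaussian_kernel(X, X, bw)
    lam_max = np.linalg.eigvalsh(K)[-1]
    lr = 1.0 / lam_max                      # largest safe fixed step size
    alpha = np.zeros(n)                     # coefficients of the kernel expansion
    for _ in range(2000):
        # Richardson iteration for K @ alpha = y (gradient descent on the square
        # loss in function space); the residual shrinks by |1 - lr*lambda_i| per
        # step along each eigendirection, so small eigenvalues converge slowly.
        alpha -= lr * (K @ alpha - y)
    resid = np.linalg.norm(K @ alpha - y)
    print(f"bandwidth={bw}: cond(K)={np.linalg.cond(K):.1e}, "
          f"residual after 2000 steps={resid:.2e}")
```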
no code implementations • 5 Jun 2023 • Chaoyue Liu, Amirhesam Abedsoltan, Mikhail Belkin
This behaviour is believed to result from neural networks learning the patterns in the clean data first and fitting the noise later in training, a phenomenon we refer to as clean-priority learning.
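A toy construction (my own, not the paper's experimental setup) that makes the phenomenon concrete: train an over-parameterized two-layer ReLU network on data where a fraction of the labels has been flipped, and track training accuracy separately on the clean and the flipped subsets; under these assumptions one would expect the clean subset to be fit early and the flipped labels only later.

```python
# Toy illustration of clean-priority learning: clean labels are fit first,
# flipped ("noisy") labels are memorized later. All settings are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, d, width = 400, 10, 512
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y_clean = np.sign(X @ w_true)

noisy = rng.random(n) < 0.15                 # flip 15% of the labels
y = np.where(noisy, -y_clean, y_clean)

# Two-layer ReLU network, full-batch gradient descent on the square loss.
W1 = rng.standard_normal((d, width)) / np.sqrt(d)
w2 = rng.standard_normal(width) / np.sqrt(width)
lr = 0.05
for epoch in range(3001):
    H = np.maximum(X @ W1, 0.0)              # hidden activations
    pred = H @ w2
    err = pred - y
    g2 = H.T @ err / n                        # backprop: gradient w.r.t. w2
    gH = np.outer(err, w2) * (H > 0)          # gradient flowing into the hidden layer
    g1 = X.T @ gH / n                         # gradient w.r.t. W1
    w2 -= lr * g2
    W1 -= lr * g1
    if epoch % 500 == 0:
        acc_clean = np.mean(np.sign(pred[~noisy]) == y[~noisy])
        acc_noisy = np.mean(np.sign(pred[noisy]) == y[noisy])
        print(f"epoch {epoch:5d}: clean-subset acc {acc_clean:.2f}, "
              f"noisy-subset acc {acc_noisy:.2f}")
```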
1 code implementation • 6 Feb 2023 • Amirhesam Abedsoltan, Mikhail Belkin, Parthe Pandit
Recent studies indicate that kernel machines can often perform comparably to or better than deep neural networks (DNNs) on small datasets.
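For concreteness, a small self-contained example of the kind of kernel machine in question, using scikit-learn's KernelRidge with a Laplacian kernel on the digits dataset (my own choice of data and hyperparameters, not the authors' EigenPro-based solver):

```python
# Laplacian-kernel ridge regression as a multi-class classifier on a small dataset.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X = X / 16.0                                   # scale pixel values to [0, 1]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# One-hot targets turn multi-class classification into vector-valued regression.
Y_tr = np.eye(10)[y_tr]

model = KernelRidge(kernel="laplacian", gamma=0.05, alpha=1e-3)
model.fit(X_tr, Y_tr)
pred = model.predict(X_te).argmax(axis=1)
print("test accuracy:", np.mean(pred == y_te))
```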
no code implementations • 14 Jul 2022 • Neil Mallinar, James B. Simon, Amirhesam Abedsoltan, Parthe Pandit, Mikhail Belkin, Preetum Nakkiran
In this work we argue that while benign overfitting has been instructive and fruitful to study, many real interpolating methods like neural networks do not fit benignly: modest noise in the training set causes nonzero (but non-infinite) excess risk at test time, implying these models are neither benign nor catastrophic but rather fall in an intermediate regime.
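A toy illustration of that intermediate ("tempered") regime, under assumptions of my own (a 1-D Laplacian kernel, near-ridgeless regression, Gaussian label noise; not the paper's experiments): the interpolating fit is evaluated against the clean target, and its excess risk grows with the label noise but remains bounded.

```python
# Near-interpolating kernel regression on increasingly noisy labels: excess test
# risk is nonzero but stays finite, i.e. neither benign nor catastrophic.
import numpy as np

rng = np.random.default_rng(0)

def laplacian(A, B, gamma=2.0):
    d = np.abs(A[:, None, 0] - B[None, :, 0])
    return np.exp(-gamma * d)

n_train, n_test = 300, 1000
X_tr = rng.uniform(-1, 1, (n_train, 1))
X_te = rng.uniform(-1, 1, (n_test, 1))
f_tr, f_te = np.sin(3 * X_tr[:, 0]), np.sin(3 * X_te[:, 0])   # clean target

K = laplacian(X_tr, X_tr)
K_te = laplacian(X_te, X_tr)

for sigma in (0.0, 0.25, 0.5, 1.0):
    y_tr = f_tr + sigma * rng.standard_normal(n_train)        # noisy training labels
    # Tiny ridge for numerical stability only; the fit (near-)interpolates y_tr.
    alpha = np.linalg.solve(K + 1e-10 * np.eye(n_train), y_tr)
    excess = np.mean((K_te @ alpha - f_te) ** 2)               # risk vs. clean target
    print(f"label noise sigma={sigma:.2f}: excess test MSE = {excess:.3f}")
```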