Search Results for author: Cengiz Pehlevan

Found 50 papers, 23 papers with code

A Dynamical Model of Neural Scaling Laws

no code implementations 2 Feb 2024 Blake Bordelon, Alexander Atanasov, Cengiz Pehlevan

On a variety of tasks, the performance of neural networks predictably improves with training time, dataset size and model size across many orders of magnitude.
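
The scaling-law viewpoint is often summarized by fitting a power law with an irreducible-loss offset to measured losses. Below is a minimal curve-fitting sketch on synthetic numbers (assuming numpy/scipy); it illustrates the empirical ansatz only, not the paper's dynamical model.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
sizes = np.logspace(3, 7, 9)                         # hypothetical dataset sizes P
loss = 0.05 + 2.0 * sizes ** -0.35                   # synthetic "measured" losses
loss *= np.exp(0.02 * rng.normal(size=sizes.size))   # multiplicative noise

def power_law(p, a, b, c):
    """Scaling ansatz L(P) = a * P^(-b) + c, with c an irreducible loss."""
    return a * p ** (-b) + c

(a, b, c), _ = curve_fit(power_law, sizes, loss, p0=[1.0, 0.3, 0.0])
print(f"fitted exponent b = {b:.3f}, irreducible loss c = {c:.4f}")
```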

Grokking as the Transition from Lazy to Rich Training Dynamics

no code implementations 9 Oct 2023 Tanishq Kumar, Blake Bordelon, Samuel J. Gershman, Cengiz Pehlevan

We identify sufficient statistics for the test loss of such a network, and tracking these over training reveals that grokking arises in this setting when the network first attempts to fit a kernel regression solution with its initial features, followed by late-time feature learning where a generalizing solution is identified after train loss is already low.

Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit

no code implementations 28 Sep 2023 Blake Bordelon, Lorenzo Noci, Mufan Bill Li, Boris Hanin, Cengiz Pehlevan

We provide experiments demonstrating that residual architectures including convolutional ResNets and Vision Transformers trained with this parameterization exhibit transfer of optimal hyperparameters across width and depth on CIFAR-10 and ImageNet.
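
A key ingredient in depthwise-transferable parameterizations is keeping residual-branch contributions controlled as depth grows. The toy forward pass below contrasts unscaled residual branches with branches scaled by 1/sqrt(depth); treat this specific scaling as an illustrative assumption, not necessarily the paper's exact parameterization.

```python
import numpy as np

def resnet_forward(x, depth, width, scale_branches):
    """Forward pass of a toy residual MLP; optionally scale each branch by 1/sqrt(depth)."""
    rng = np.random.default_rng(0)
    branch_scale = 1.0 / np.sqrt(depth) if scale_branches else 1.0
    h = x
    for _ in range(depth):
        W = rng.normal(0.0, 1.0 / np.sqrt(width), (width, width))
        h = h + branch_scale * np.tanh(W @ h)
    return h

width = 128
x = np.random.default_rng(1).normal(size=width)
for depth in [4, 64, 1024]:
    scaled = resnet_forward(x, depth, width, scale_branches=True)
    plain = resnet_forward(x, depth, width, scale_branches=False)
    # Scaled branches keep the hidden norm O(1); unscaled branches grow with depth.
    print(depth,
          round(np.linalg.norm(scaled) / np.sqrt(width), 2),
          round(np.linalg.norm(plain) / np.sqrt(width), 2))
```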

Loss Dynamics of Temporal Difference Reinforcement Learning

1 code implementation NeurIPS 2023 Blake Bordelon, Paul Masset, Henry Kuo, Cengiz Pehlevan

We study how learning dynamics and plateaus depend on feature structure, learning rate, discount factor, and reward function.

reinforcement-learning
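
For readers who want a concrete baseline for such dynamics, here is a generic TD(0) sketch with linear features on a random Markov reward process (numpy only; not the paper's analytical setup). The value error typically decays and then plateaus when the features cannot represent the true value function.

```python
import numpy as np

rng = np.random.default_rng(0)
S, d = 20, 5                                    # states, feature dimension
P = rng.random((S, S)); P /= P.sum(1, keepdims=True)   # random transition matrix
r = rng.normal(size=S)                          # rewards
gamma = 0.9
Phi = rng.normal(size=(S, d)) / np.sqrt(d)      # fixed linear features

# True value function from the Bellman equation V = r + gamma * P V.
V_true = np.linalg.solve(np.eye(S) - gamma * P, r)

w = np.zeros(d)
lr, s, errors = 0.05, 0, []
for t in range(20000):
    s_next = rng.choice(S, p=P[s])
    td_error = r[s] + gamma * Phi[s_next] @ w - Phi[s] @ w
    w += lr * td_error * Phi[s]                 # TD(0) update on linear features
    s = s_next
    if t % 1000 == 0:
        errors.append(np.mean((Phi @ w - V_true) ** 2))
print(np.round(errors[:3], 3), round(errors[-1], 3))
```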

Learning Curves for Noisy Heterogeneous Feature-Subsampled Ridge Ensembles

1 code implementation NeurIPS 2023 Benjamin S. Ruben, Cengiz Pehlevan

Feature bagging is a well-established ensembling method which aims to reduce prediction variance by combining predictions of many estimators trained on subsets or projections of features.

Image Classification
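
A minimal feature-bagging sketch of the kind of ensemble studied here: each ridge estimator sees a random subset of features, and predictions are averaged (numpy + scikit-learn; the heterogeneous-noise analysis itself is in the paper).

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n, d, k, n_members = 200, 100, 30, 20           # samples, features, subset size, ensemble size

beta = rng.normal(size=d) / np.sqrt(d)          # linear teacher
X, X_test = rng.normal(size=(n, d)), rng.normal(size=(1000, d))
y = X @ beta + 0.5 * rng.normal(size=n)         # noisy labels
y_test = X_test @ beta

preds = []
for _ in range(n_members):
    idx = rng.choice(d, size=k, replace=False)  # each member trains on a random feature subset
    model = Ridge(alpha=1.0).fit(X[:, idx], y)
    preds.append(model.predict(X_test[:, idx]))

ensemble = np.mean(preds, axis=0)
print("single member MSE:", round(np.mean((preds[0] - y_test) ** 2), 3))
print("ensemble MSE:     ", round(np.mean((ensemble - y_test) ** 2), 3))
```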

Long Sequence Hopfield Memory

1 code implementation NeurIPS 2023 Hamza Tahir Chaudhry, Jacob A. Zavatone-Veth, Dmitry Krotov, Cengiz Pehlevan

Sequence memory is an essential attribute of natural and artificial intelligence that enables agents to encode, store, and retrieve complex sequences of stimuli and actions.

Attribute
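
As a concrete baseline for sequence recall, the sketch below uses the classical asymmetric Hebbian rule, in which each stored pattern points to its successor; the paper's long-sequence model goes well beyond this, so treat it only as an illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 200, 10                                   # neurons, sequence length
patterns = rng.choice([-1, 1], size=(T, N))      # random binary sequence to store

# Asymmetric Hebbian rule: each pattern is wired to recall its successor.
W = sum(np.outer(patterns[(t + 1) % T], patterns[t]) for t in range(T)) / N

x = patterns[0].copy()
x[: N // 10] *= -1                               # corrupt 10% of the cue
recalled = []
for _ in range(T):
    x = np.sign(W @ x)                           # each update steps through the sequence
    recalled.append(x.copy())

overlaps = [recalled[t] @ patterns[(t + 1) % T] / N for t in range(T)]
print(np.round(overlaps, 2))                     # overlaps near 1.0 => sequence retrieved
```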

Correlative Information Maximization: A Biologically Plausible Approach to Supervised Deep Neural Networks without Weight Symmetry

1 code implementation NeurIPS 2023 Bariscan Bozkurt, Cengiz Pehlevan, Alper T Erdogan

Furthermore, our approach provides a natural resolution to the weight symmetry problem between forward and backward signal propagation paths, a significant critique against the plausibility of the conventional backpropagation algorithm.

Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks

1 code implementation NeurIPS 2023 Blake Bordelon, Cengiz Pehlevan

However, in the rich, feature learning regime, the fluctuations of the kernels and predictions are dynamically coupled with a variance that can be computed self-consistently.

Neural networks learn to magnify areas near decision boundaries

1 code implementation 26 Jan 2023 Jacob A. Zavatone-Veth, Sheng Yang, Julian A. Rubinfien, Cengiz Pehlevan

This holds in deep networks trained on high-dimensional image classification tasks, and even in self-supervised representation learning.

Image Classification Representation Learning

The Onset of Variance-Limited Behavior for Networks in the Lazy and Rich Regimes

1 code implementation 23 Dec 2022 Alexander Atanasov, Blake Bordelon, Sabarish Sainathan, Cengiz Pehlevan

For small training set sizes $P$, the generalization error of wide neural networks is well-approximated by the error of an infinite width neural network (NN), either in the kernel or mean-field/feature-learning regime.

regression

Correlative Information Maximization Based Biologically Plausible Neural Networks for Correlated Source Separation

1 code implementation 9 Oct 2022 Bariscan Bozkurt, Ates Isfendiyaroglu, Cengiz Pehlevan, Alper T. Erdogan

Here, we relax this limitation and propose a biologically plausible neural network that extracts correlated latent sources by exploiting information about their domains.

The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks

no code implementations 5 Oct 2022 Blake Bordelon, Cengiz Pehlevan

In the lazy limit, we find that DFA and Hebb can only learn using the last layer features, while full FA can utilize earlier layers with a scale determined by the initial correlation between feedforward and feedback weight matrices.
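
For reference, a generic feedback-alignment sketch (one of the learning rules discussed): the backward pass routes errors through a fixed random matrix instead of the transposed forward weights. This toy two-layer regression in numpy is an illustrative assumption, not the paper's wide-network analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, d_out, n = 10, 64, 1, 500
X = rng.normal(size=(n, d_in))
y = np.sin(X[:, :1])                                # toy regression target

W1 = rng.normal(0, 1 / np.sqrt(d_in), (d_in, d_hidden))
W2 = rng.normal(0, 1 / np.sqrt(d_hidden), (d_hidden, d_out))
B = rng.normal(0, 1 / np.sqrt(d_out), (d_out, d_hidden))  # fixed random feedback matrix

lr = 0.05
for step in range(2000):
    h = np.tanh(X @ W1)
    err = h @ W2 - y                                # dL/dprediction for mean-square loss
    delta_h = (err @ B) * (1 - h ** 2)              # feedback alignment: use B, not W2.T
    W2 -= lr * h.T @ err / n
    W1 -= lr * X.T @ delta_h / n
print("final MSE:", round(float(np.mean((np.tanh(X @ W1) @ W2 - y) ** 2)), 4))
```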

Biologically-Plausible Determinant Maximization Neural Networks for Blind Separation of Correlated Sources

2 code implementations 27 Sep 2022 Bariscan Bozkurt, Cengiz Pehlevan, Alper T. Erdogan

Previous work on biologically-plausible BSS algorithms assumed that observed signals are linear mixtures of statistically independent or uncorrelated sources, limiting the domain of applicability of these algorithms.

blind source separation
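
The classical setting described here (linear mixtures of statistically independent sources) can be illustrated with off-the-shelf ICA; the sketch below uses scikit-learn's FastICA as that baseline, not the paper's determinant-maximization network.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n = 5000
t = np.linspace(0, 8, n)
S = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t)), rng.laplace(size=n)]  # independent sources
A = rng.normal(size=(3, 3))                                            # unknown mixing matrix
X = S @ A.T                                                            # observed mixtures

S_hat = FastICA(n_components=3, random_state=0).fit_transform(X)
# Recovered sources match the originals up to permutation and scale.
corr = np.corrcoef(S.T, S_hat.T)[:3, 3:]
print(np.round(np.abs(corr), 2))
```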

Interneurons accelerate learning dynamics in recurrent neural networks for statistical adaptation

no code implementations 21 Sep 2022 David Lipshutz, Cengiz Pehlevan, Dmitri B. Chklovskii

To this end, we consider two mathematically tractable recurrent linear neural networks that statistically whiten their inputs -- one with direct recurrent connections and the other with interneurons that mediate recurrent communication.
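
The offline objective these circuits implement is statistical whitening. A minimal ZCA-style sketch follows (numpy); the paper derives online recurrent and interneuron circuits rather than this batch computation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 10000
A = rng.normal(size=(d, d))
X = A @ rng.normal(size=(d, n))                    # correlated inputs, columns are samples

C = X @ X.T / n                                    # input covariance
evals, evecs = np.linalg.eigh(C)
M = evecs @ np.diag(evals ** -0.5) @ evecs.T       # ZCA whitening matrix C^{-1/2}
Y = M @ X

print(np.round(Y @ Y.T / n, 2))                    # ~ identity: outputs are whitened
```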

Bandwidth Enables Generalization in Quantum Kernel Models

no code implementations 14 Jun 2022 Abdulkadir Canatar, Evan Peters, Cengiz Pehlevan, Stefan M. Wild, Ruslan Shaydulin

Quantum computers are known to provide speedups over classical state-of-the-art machine learning methods in some specialized settings.

Inductive Bias

Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks

no code implementations 19 May 2022 Blake Bordelon, Cengiz Pehlevan

We analyze feature learning in infinite-width neural networks trained with gradient flow through a self-consistent dynamical field theory.

Contrasting random and learned features in deep Bayesian linear regression

no code implementations 1 Mar 2022 Jacob A. Zavatone-Veth, William L. Tong, Cengiz Pehlevan

Moreover, we show that the leading-order correction to the kernel-limit learning curve cannot distinguish between random feature models and deep networks in which all layers are trained.

Learning Theory regression

On neural network kernels and the storage capacity problem

no code implementations 12 Jan 2022 Jacob A. Zavatone-Veth, Cengiz Pehlevan

In this short note, we reify the connection between work on the storage capacity problem in wide two-layer treelike neural networks and the rapidly-growing body of literature on kernel limits of wide neural networks.

Depth induces scale-averaging in overparameterized linear Bayesian neural networks

no code implementations 23 Nov 2021 Jacob A. Zavatone-Veth, Cengiz Pehlevan

Inference in deep Bayesian neural networks is only fully understood in the infinite-width limit, where the posterior flexibility afforded by increased depth washes out and the posterior predictive collapses to a shallow Gaussian process.

Representation Learning

Attention Approximates Sparse Distributed Memory

1 code implementation NeurIPS 2021 Trenton Bricken, Cengiz Pehlevan

While Attention has come to be an important mechanism in deep learning, there remains limited intuition for why it works so well.
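
For context, the attention mechanism in question is standard scaled dot-product attention, sketched below in numpy; the correspondence to Sparse Distributed Memory is the paper's contribution and is not reproduced here.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 16))     # 4 queries
K = rng.normal(size=(10, 16))    # 10 stored keys
V = rng.normal(size=(10, 16))    # 10 stored values
print(attention(Q, K, V).shape)  # (4, 16)
```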

Neural Networks as Kernel Learners: The Silent Alignment Effect

no code implementations ICLR 2022 Alexander Atanasov, Blake Bordelon, Cengiz Pehlevan

Can neural networks in the rich feature learning regime learn a kernel machine with a data-dependent kernel?

Out-of-Distribution Generalization in Kernel Regression

1 code implementation NeurIPS 2021 Abdulkadir Canatar, Blake Bordelon, Cengiz Pehlevan

Here, we study generalization in kernel regression when the training and test distributions are different using methods from statistical physics.

BIG-bench Machine Learning Out-of-Distribution Generalization +1
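
An empirical companion to this setting: fit kernel ridge regression on one input distribution and evaluate on a shifted one. The sketch below (scikit-learn's KernelRidge, with arbitrary illustrative parameters) typically shows the out-of-distribution error exceeding the in-distribution error.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
d, n_train, n_test = 10, 300, 2000

def target(X):
    return np.sin(X @ np.ones(d) / np.sqrt(d))

X_train = rng.normal(0.0, 1.0, size=(n_train, d))
X_id    = rng.normal(0.0, 1.0, size=(n_test, d))    # in-distribution test set
X_ood   = rng.normal(0.5, 1.5, size=(n_test, d))    # shifted/broadened test distribution

model = KernelRidge(kernel="rbf", alpha=1e-3, gamma=1.0 / d).fit(X_train, target(X_train))
for name, X in [("in-dist", X_id), ("out-of-dist", X_ood)]:
    print(name, round(float(np.mean((model.predict(X) - target(X)) ** 2)), 4))
```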

Learning Curves for SGD on Structured Features

1 code implementation ICLR 2022 Blake Bordelon, Cengiz Pehlevan

To analyze the influence of data structure on test loss dynamics, we study an exactly solvable model of stochastic gradient descent (SGD) on mean square loss which predicts test loss when training on features with arbitrary covariance structure.

BIG-bench Machine Learning Feature Correlation +1
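
A minimal numerical instance of this setting: one-sample SGD on squared loss with features drawn from a power-law covariance, tracking test loss (numpy only; the exactly solvable theory is in the paper).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 200
spectrum = np.arange(1, d + 1) ** -1.5               # power-law feature covariance
w_star = rng.normal(size=d)                          # teacher weights

def sample(n):
    X = rng.normal(size=(n, d)) * np.sqrt(spectrum)  # features with covariance diag(spectrum)
    return X, X @ w_star

X_test, y_test = sample(5000)
w = np.zeros(d)
lr = 0.2
for t in range(1, 20001):
    x, y = sample(1)
    w -= lr * (x[0] @ w - y[0]) * x[0]               # one-sample SGD step on squared loss
    if t % 5000 == 0:
        print(t, round(float(np.mean((X_test @ w - y_test) ** 2)), 4))
```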

Asymptotics of representation learning in finite Bayesian neural networks

1 code implementation NeurIPS 2021 Jacob A. Zavatone-Veth, Abdulkadir Canatar, Benjamin S. Ruben, Cengiz Pehlevan

However, our theoretical understanding of how the learned hidden layer representations of finite networks differ from the fixed representations of infinite networks remains incomplete.

Representation Learning

Exact marginal prior distributions of finite Bayesian neural networks

1 code implementation NeurIPS 2021 Jacob A. Zavatone-Veth, Cengiz Pehlevan

For deep linear networks, the prior has a simple expression in terms of the Meijer $G$-function.

Biologically plausible single-layer networks for nonnegative independent component analysis

1 code implementation 23 Oct 2020 David Lipshutz, Cengiz Pehlevan, Dmitri B. Chklovskii

To model how the brain performs this task, we seek a biologically plausible single-layer neural network implementation of a blind source separation algorithm.

blind source separation

Activation function dependence of the storage capacity of treelike neural networks

no code implementations 21 Jul 2020 Jacob A. Zavatone-Veth, Cengiz Pehlevan

Though a wide variety of nonlinear activation functions have been proposed for use in artificial neural networks, a detailed understanding of their role in determining the expressive power of a network has not emerged.

Associative Memory in Iterated Overparameterized Sigmoid Autoencoders

no code implementations ICML 2020 Yibo Jiang, Cengiz Pehlevan

Recent work showed that overparameterized autoencoders can be trained to implement associative memory via iterative maps, when the trained input-output Jacobian of the network has all of its eigenvalue norms strictly below one.

Learning Theory regression

Spectral Bias and Task-Model Alignment Explain Generalization in Kernel Regression and Infinitely Wide Neural Networks

1 code implementation 23 Jun 2020 Abdulkadir Canatar, Blake Bordelon, Cengiz Pehlevan

We present applications of our theory to real and synthetic datasets, and for many kernels including those that arise from training deep neural networks in the infinite-width limit.

BIG-bench Machine Learning Inductive Bias +1

Blind Bounded Source Separation Using Neural Networks with Local Learning Rules

1 code implementation 11 Apr 2020 Alper T. Erdogan, Cengiz Pehlevan

An important problem encountered by both natural and engineered signal processing systems is blind source separation.

blind source separation

Contrastive Similarity Matching for Supervised Learning

no code implementations 24 Feb 2020 Shanshan Qin, Nayantara Mudur, Cengiz Pehlevan

We propose a novel biologically-plausible solution to the credit assignment problem motivated by observations in the ventral visual pathway and trained deep neural networks.

Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks

1 code implementation ICML 2020 Blake Bordelon, Abdulkadir Canatar, Cengiz Pehlevan

We derive analytical expressions for the generalization performance of kernel regression as a function of the number of training samples using theoretical methods from Gaussian processes and statistical physics.

Gaussian Processes regression
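
An empirical learning curve to compare against such analytical expressions: kernel ridge regression error averaged over random training sets of increasing size (scikit-learn; the kernel and target here are arbitrary illustrative choices).

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
d, n_test = 5, 2000
X_test = rng.normal(size=(n_test, d))
f = lambda X: np.cos(X @ np.ones(d))                 # target function
y_test = f(X_test)

for P in [10, 30, 100, 300, 1000]:                   # number of training samples
    errs = []
    for trial in range(10):                          # average over draws of the training set
        X = rng.normal(size=(P, d))
        model = KernelRidge(kernel="rbf", alpha=1e-4, gamma=1.0 / d).fit(X, f(X))
        errs.append(np.mean((model.predict(X_test) - y_test) ** 2))
    print(P, round(float(np.mean(errs)), 4))
```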

A Closer Look at Disentangling in $\beta$-VAE

no code implementations 11 Dec 2019 Harshvardhan Sikka, Weishun Zhong, Jun Yin, Cengiz Pehlevan

In many data analysis tasks, it is beneficial to learn representations where each dimension is statistically independent and thus disentangled from the others.

Bayesian Inference Variational Inference
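
The objective in question is the $\beta$-weighted ELBO: reconstruction error plus $\beta$ times the KL divergence of a diagonal-Gaussian posterior from a standard normal prior. A minimal numpy sketch of that loss, with stand-in encoder/decoder outputs in place of a trained model:

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta):
    """Reconstruction error + beta * KL(q(z|x) || N(0, I)) for a diagonal-Gaussian encoder."""
    recon = np.sum((x - x_recon) ** 2, axis=-1)
    kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)
    return np.mean(recon + beta * kl)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 784))
x_recon = x + 0.1 * rng.normal(size=x.shape)         # stand-in for a decoder output
mu, log_var = rng.normal(size=(8, 10)), rng.normal(size=(8, 10))
print(round(float(beta_vae_loss(x, x_recon, mu, log_var, beta=4.0)), 3))
```

Larger $\beta$ penalizes the KL term more heavily, which is the usual lever for encouraging statistically independent (disentangled) latent dimensions.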

Structured and Deep Similarity Matching via Structured and Deep Hebbian Networks

1 code implementation NeurIPS 2019 Dina Obeid, Hugo Ramambason, Cengiz Pehlevan

In single-layered and all-to-all connected neural networks, local plasticity has been shown to implement gradient-based learning on a class of cost functions that contain a term that aligns the similarity of outputs to the similarity of inputs.

Neuroscience-inspired online unsupervised learning algorithms

no code implementations 5 Aug 2019 Cengiz Pehlevan, Dmitri B. Chklovskii

Although the currently popular deep learning networks achieve unprecedented performance on some tasks, the human brain still has a monopoly on general intelligence.

Clustering Dimensionality Reduction

A Spiking Neural Network with Local Learning Rules Derived From Nonnegative Similarity Matching

no code implementations 4 Feb 2019 Cengiz Pehlevan

The design and analysis of spiking neural network algorithms will be accelerated by the advent of new theoretical approaches.

Manifold-tiling Localized Receptive Fields are Optimal in Similarity-preserving Neural Networks

1 code implementation NeurIPS 2018 Anirvan Sengupta, Cengiz Pehlevan, Mariano Tepper, Alexander Genkin, Dmitri Chklovskii

Many neurons in the brain, such as place cells in the rodent hippocampus, have localized receptive fields, i.e., they respond to a small neighborhood of stimulus space.

Hippocampus

Efficient Principal Subspace Projection of Streaming Data Through Fast Similarity Matching

no code implementations 6 Aug 2018 Andrea Giovannucci, Victor Minden, Cengiz Pehlevan, Dmitri B. Chklovskii

Big data problems frequently require processing datasets in a streaming fashion, either because all data are available at once but collectively are larger than available memory or because the data intrinsically arrive one data point at a time and must be processed online.

Dimensionality Reduction
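
As a point of reference for streaming principal subspace projection, here is the classical Oja subspace rule processing one sample at a time (numpy); the paper's fast similarity matching algorithm is a different, more efficient method, so this is only a baseline sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 20, 3, 30000

# Data with three dominant principal components on top of isotropic noise.
evals = np.concatenate([[10.0, 8.0, 6.0], np.ones(d - k)])
U = np.linalg.qr(rng.normal(size=(d, d)))[0]
C_sqrt = U @ np.diag(np.sqrt(evals))

W = 0.1 * rng.normal(size=(k, d))
lr = 1e-3
for _ in range(n):
    x = C_sqrt @ rng.normal(size=d)                  # one streaming sample at a time
    y = W @ x
    W += lr * (np.outer(y, x) - np.outer(y, y) @ W)  # Oja's subspace rule

Q, _ = np.linalg.qr(W.T)                             # orthonormal basis of the learned subspace
print(np.round(np.linalg.svd(U[:, :k].T @ Q)[1], 3)) # singular values near 1 => subspace recovered
```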

Blind nonnegative source separation using biological neural networks

no code implementations 1 Jun 2017 Cengiz Pehlevan, Sreyas Mohan, Dmitri B. Chklovskii

Blind source separation, i.e., extraction of independent sources from a mixture, is an important problem for both artificial and natural signal processing.

blind source separation

Why do similarity matching objectives lead to Hebbian/anti-Hebbian networks?

no code implementations 23 Mar 2017 Cengiz Pehlevan, Anirvan Sengupta, Dmitri B. Chklovskii

Modeling self-organization of neural networks for unsupervised learning using Hebbian and anti-Hebbian plasticity has a long history in neuroscience.

Dimensionality Reduction
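
The objective referred to here is similarity matching: choose low-dimensional outputs whose similarity (Gram) matrix best matches that of the inputs. The offline sketch below solves it by eigendecomposition and checks the Eckart-Young optimum; the papers' contribution is the online Hebbian/anti-Hebbian circuits that optimize it, which are not implemented here.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 10, 2, 500
X = rng.normal(size=(d, n)) * np.linspace(3.0, 0.5, d)[:, None]  # columns are samples

# Offline similarity matching: min_Y || X^T X - Y^T Y ||_F^2 over k-dimensional outputs Y.
G = X.T @ X                                          # input similarity (Gram) matrix
evals, evecs = np.linalg.eigh(G)
top = np.argsort(evals)[::-1][:k]
Y = np.diag(np.sqrt(evals[top])) @ evecs[:, top].T   # optimum = top-k PCA scores up to sign

# Eckart-Young check: the residual equals the norm of the discarded Gram eigenvalues.
residual = np.linalg.norm(G - Y.T @ Y)
best = np.sqrt(np.sum(np.sort(evals)[:-k] ** 2))
print(round(float(residual), 3), round(float(best), 3))
```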

Self-calibrating Neural Networks for Dimensionality Reduction

no code implementations 11 Dec 2016 Yuansi Chen, Cengiz Pehlevan, Dmitri B. Chklovskii

Here we propose online algorithms where the threshold is self-calibrating based on the singular values computed from the existing observations.

Dimensionality Reduction

A Normative Theory of Adaptive Dimensionality Reduction in Neural Networks

no code implementations NeurIPS 2015 Cengiz Pehlevan, Dmitri B. Chklovskii

Here, we derive biologically plausible dimensionality reduction algorithms which adapt the number of output dimensions to the eigenspectrum of the input covariance matrix.

Dimensionality Reduction

Optimization theory of Hebbian/anti-Hebbian networks for PCA and whitening

no code implementations 30 Nov 2015 Cengiz Pehlevan, Dmitri B. Chklovskii

Here, we focus on such workhorses of signal processing as Principal Component Analysis (PCA) and whitening which maximize information transmission in the presence of noise.

A Hebbian/Anti-Hebbian Network Derived from Online Non-Negative Matrix Factorization Can Cluster and Discover Sparse Features

2 code implementations 2 Mar 2015 Cengiz Pehlevan, Dmitri B. Chklovskii

Despite our extensive knowledge of biophysical properties of neurons, there is no commonly accepted algorithmic theory of neuronal function.

Anatomy Clustering

A Hebbian/Anti-Hebbian Neural Network for Linear Subspace Learning: A Derivation from Multidimensional Scaling of Streaming Data

no code implementations 2 Mar 2015 Cengiz Pehlevan, Tao Hu, Dmitri B. Chklovskii

Such networks learn the principal subspace, in the sense of principal component analysis (PCA), by adjusting synaptic weights according to activity-dependent learning rules.

A Hebbian/Anti-Hebbian Network for Online Sparse Dictionary Learning Derived from Symmetric Matrix Factorization

no code implementations 2 Mar 2015 Tao Hu, Cengiz Pehlevan, Dmitri B. Chklovskii

Here, to overcome this problem, we derive sparse dictionary learning from a novel cost-function - a regularized error of the symmetric factorization of the input's similarity matrix.

Dictionary Learning

A Neuron as a Signal Processing Device

no code implementations 12 May 2014 Tao Hu, Zaid J. Towfic, Cengiz Pehlevan, Alex Genkin, Dmitri B. Chklovskii

Here we propose to view a neuron as a signal processing device that represents the incoming streaming data matrix as a sparse vector of synaptic weights scaled by an outgoing sparse activity vector.
