Search Results for author: Tamra Nebabu

Found 1 paper, 0 papers with code

Geometric Dynamics of Signal Propagation Predict Trainability of Transformers

no code implementations • 5 Mar 2024 • Aditya Cowsik, Tamra Nebabu, Xiao-Liang Qi, Surya Ganguli

Our update equations show that without MLP layers, this system will collapse to a line, consistent with prior work on rank collapse in transformers.
