Search Results for author: Chuning Li

Found 1 paper, 0 papers with code

The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit

NeurIPS 2023 (no code implementation). Lorenzo Noci, Chuning Li, Mufan Bill Li, Bobby He, Thomas Hofmann, Chris Maddison, Daniel M. Roy

Motivated by the success of Transformers, we study the covariance matrix of a modified Softmax-based attention model with skip connections in the proportional limit of infinite-depth-and-width.
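To make the object of study concrete, here is a minimal sketch of a softmax self-attention layer with a skip connection, together with a Monte Carlo estimate of the token-token covariance matrix of its output over random weight draws. This is an illustrative toy, not the paper's "shaped" parameterization or its proportional depth-and-width limit; the dimensions, scalings, and helper names are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def attention_with_skip(X, Wq, Wk, Wv):
    """One softmax self-attention layer with a residual (skip) connection.

    A generic illustration; NOT the paper's exact shaped attention model.
    """
    d = X.shape[-1]
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)
    # numerically stable softmax over keys
    A = np.exp(scores - scores.max(axis=-1, keepdims=True))
    A /= A.sum(axis=-1, keepdims=True)
    return X + A @ V  # skip connection

# Monte Carlo estimate of the token-token covariance of the output
# across random weight draws (T tokens, width n are illustrative choices).
T, n, samples = 4, 64, 200
X = rng.standard_normal((T, n)) / np.sqrt(n)
outs = []
for _ in range(samples):
    Wq, Wk, Wv = (rng.standard_normal((n, n)) / np.sqrt(n) for _ in range(3))
    outs.append(attention_with_skip(X, Wq, Wk, Wv))
outs = np.stack(outs)  # shape: (samples, T, n)

# average the outer product over width and weight samples
cov = np.einsum('sti,sui->tu', outs, outs) / (samples * n)
print(cov.shape)
```

In the infinite-width analysis, the empirical average over the width dimension above is replaced by its deterministic limit; the paper studies how this covariance evolves when depth and width grow proportionally.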

Tags: Deep Attention, Learning Theory
