Search Results for author: Chuning Li

Found 1 paper, 0 papers with code

The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit

NeurIPS 2023 (no code implementation). Lorenzo Noci, Chuning Li, Mufan Bill Li, Bobby He, Thomas Hofmann, Chris Maddison, Daniel M. Roy

Motivated by the success of Transformers, we study the covariance matrix of a modified Softmax-based attention model with skip connections in the proportional limit of infinite-depth-and-width.
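To make the object of study concrete, here is a minimal sketch of a softmax self-attention layer with a skip connection, together with a Monte Carlo estimate of the token-token covariance matrix of its output over random weight draws. This is an illustrative toy, not the paper's "shaped" parameterization or its proportional depth-and-width limit; the dimensions, scalings, and helper names are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def attention_with_skip(X, Wq, Wk, Wv):
    """One softmax self-attention layer with a residual (skip) connection.

    A generic illustration; NOT the paper's exact shaped attention model.
    """
    d = X.shape[-1]
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)
    # numerically stable softmax over keys
    A = np.exp(scores - scores.max(axis=-1, keepdims=True))
    A /= A.sum(axis=-1, keepdims=True)
    return X + A @ V  # skip connection

# Monte Carlo estimate of the token-token covariance of the output
# across random weight draws (T tokens, width n are illustrative choices).
T, n, samples = 4, 64, 200
X = rng.standard_normal((T, n)) / np.sqrt(n)
outs = []
for _ in range(samples):
    Wq, Wk, Wv = (rng.standard_normal((n, n)) / np.sqrt(n) for _ in range(3))
    outs.append(attention_with_skip(X, Wq, Wk, Wv))
outs = np.stack(outs)  # shape: (samples, T, n)

# average the outer product over width and weight samples
cov = np.einsum('sti,sui->tu', outs, outs) / (samples * n)
print(cov.shape)
```

In the infinite-width analysis, the empirical average over the width dimension above is replaced by its deterministic limit; the paper studies how this covariance evolves when depth and width grow proportionally.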

Tags: Deep Attention, Learning Theory
