no code implementations • 23 Jul 2023 • Guan Shen, Jieru Zhao, Zeke Wang, Zhe Lin, Wenchao Ding, Chentao Wu, Quan Chen, Minyi Guo
Along with the fast evolution of deep neural networks, the hardware system is also developing rapidly.
no code implementations • 29 Jun 2022 • Guan Shen, Jieru Zhao, Quan Chen, Jingwen Leng, Chao Li, Minyi Guo
However, the quadratic complexity of self-attention w. r. t the sequence length incurs heavy computational and memory burdens, especially for tasks with long sequences.