no code implementations • 2 Nov 2023 • Ke Hong, Guohao Dai, Jiaming Xu, Qiuli Mao, Xiuhong Li, Jun Liu, Kangdi Chen, Yuhan Dong, Yu Wang
A single and static dataflow may lead to a 50.25% performance loss for GEMMs of different shapes in LLM inference.