1 code implementation • 14 Feb 2024 • Xiongye Xiao, Chenyu Zhou, Heng Ping, Defu Cao, Yaxing Li, Yizhuo Zhou, Shixuan Li, Paul Bogdan
Prior studies on the emergence in large models have primarily focused on how the functional capabilities of large language models (LLMs) scale with model size.
1 code implementation • 20 Dec 2023 • Anzhe Cheng, Zhenkun Wang, Chenzhong Yin, Mingxi Cheng, Heng Ping, Xiongye Xiao, Shahin Nazarian, Paul Bogdan
This includes decisions on how to decouple network blocks and which auxiliary networks to use for each block.
no code implementations • 9 Dec 2023 • Shukai Duan, Nikos Kanakaris, Xiongye Xiao, Heng Ping, Chenyu Zhou, Nesreen K. Ahmed, Guixiang Ma, Mihai Capota, Theodore L. Willke, Shahin Nazarian, Paul Bogdan
We compare our framework with existing state-of-the-art models and show that it is more efficient with respect to speed and computational usage, as a result of the decrement in training steps and its applicability to models with fewer parameters.