no code implementations • 24 Feb 2024 • Yong Liu, Zirui Zhu, Chaoyu Gong, Minhao Cheng, Cho-Jui Hsieh, Yang You
While fine-tuning large language models (LLMs) for specific tasks often yields impressive results, it is memory-inefficient: gradient-based training relies on back-propagation, which must store intermediate activations across all layers.
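One family of memory-saving alternatives avoids back-propagation entirely by estimating gradients from forward passes alone (zeroth-order optimization). Below is a minimal, illustrative MeZO-style sketch of such a step, not this paper's method; the `zo_step` name, the `loss_fn(model, batch)` signature, and the hyperparameters are assumptions for demonstration.

```python
import torch

def zo_step(model, loss_fn, batch, lr=1e-6, eps=1e-3, seed=0):
    """One zeroth-order update: two forward passes, no back-propagation."""

    def perturb(scale):
        # Re-seeding regenerates the same noise vector z each call,
        # so z never has to be stored (the MeZO memory trick).
        torch.manual_seed(seed)
        for p in model.parameters():
            p.data.add_(scale * eps * torch.randn_like(p))

    with torch.no_grad():                  # no activations are kept
        perturb(+1)                        # theta + eps * z
        loss_plus = loss_fn(model, batch)
        perturb(-2)                        # theta - eps * z
        loss_minus = loss_fn(model, batch)
        perturb(+1)                        # restore original theta

        # Finite-difference estimate of the directional derivative along z.
        grad_scale = (loss_plus - loss_minus) / (2 * eps)

        torch.manual_seed(seed)            # regenerate the same z
        for p in model.parameters():
            p.data.add_(-lr * grad_scale * torch.randn_like(p))
```

Because only forward passes run, peak memory stays close to inference-time usage, at the cost of a noisier gradient estimate.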
1 code implementation • 12 Apr 2021 • Qifan Xu, Shenggui Li, Chaoyu Gong, Yang You
However, large models often exceed the memory of a single device, so model parallelism must be used to partition them across multiple devices.
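To make the idea concrete, here is a minimal single-process sketch of tensor-style model parallelism: a linear layer's weight matrix is split by output columns across devices, so no single device holds the whole layer. The `ColumnParallelLinear` class and the device list are illustrative assumptions, not the paper's 2D parallel algorithm.

```python
import torch
import torch.nn as nn

class ColumnParallelLinear(nn.Module):
    """Splits a linear layer's weight by output columns across devices."""

    def __init__(self, in_features, out_features, devices):
        super().__init__()
        assert out_features % len(devices) == 0, "columns must divide evenly"
        cols = out_features // len(devices)
        self.devices = devices
        # Each shard holds only its slice of the full weight matrix.
        self.shards = nn.ModuleList(
            nn.Linear(in_features, cols).to(d) for d in devices
        )

    def forward(self, x):
        # Every device computes its slice of the output; the slices are
        # moved back to the input's device and concatenated.
        outs = [shard(x.to(d)) for shard, d in zip(self.shards, self.devices)]
        return torch.cat([o.to(x.device) for o in outs], dim=-1)

# Illustrative usage; pass ["cpu", "cpu"] to try it without GPUs.
layer = ColumnParallelLinear(512, 1024, ["cpu", "cpu"])
y = layer(torch.randn(8, 512))            # shape: (8, 1024)
```

Splitting by columns keeps each partial matrix multiply independent, so the only communication is gathering the output slices.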