1 code implementation • 29 Mar 2024 • Hanting Chen, Zhicheng Liu, Xutao Wang, Yuchuan Tian, Yunhe Wang
In an effort to reduce the computational load of Transformers, research on linear attention has gained significant momentum.
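For context, the core idea behind linear attention is to replace the softmax kernel with a feature map so that attention can be computed associatively in linear time. Below is a minimal sketch of that generic formulation (using the elu+1 feature map of Katharopoulos et al., 2020), not necessarily the exact method proposed in this paper:

```python
# Generic linear attention via a kernel feature map (illustrative sketch).
import torch

def elu_feature_map(x):
    # phi(x) = elu(x) + 1 keeps features strictly positive.
    return torch.nn.functional.elu(x) + 1

def linear_attention(q, k, v, eps=1e-6):
    # q, k: (batch, seq, dim); v: (batch, seq, dim_v)
    q, k = elu_feature_map(q), elu_feature_map(k)
    # Associativity: phi(Q) (phi(K)^T V) costs O(n*d^2) instead of O(n^2*d).
    kv = torch.einsum("bnd,bne->bde", k, v)                         # (b, d, dim_v)
    z = 1.0 / (torch.einsum("bnd,bd->bn", q, k.sum(dim=1)) + eps)   # normalizer
    return torch.einsum("bnd,bde,bn->bne", q, kv, z)
```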
1 code implementation • 5 Feb 2024 • Yehui Tang, Fangcheng Liu, Yunsheng Ni, Yuchuan Tian, Zheyuan Bai, Yi-Qi Hu, Sichao Liu, Shangling Jui, Kai Han, Yunhe Wang
Several design formulas are empirically shown to be especially effective for tiny language models, including tokenizer compression, architecture tweaking, parameter inheritance, and multiple-round training.
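To make one of these formulas concrete, here is a hedged sketch of parameter inheritance, i.e. warm-starting a tiny model from a larger pretrained checkpoint; the name mapping and leading-slice selection below are illustrative assumptions, not the paper's exact recipe:

```python
# Hypothetical parameter-inheritance helper (illustrative only).
import torch

def inherit_parameters(large_state, small_state, layer_map):
    """Copy (and slice) weights from a large checkpoint into a small one.

    layer_map maps small-model parameter names to large-model names,
    e.g. {"layers.0.w": "layers.0.w", "layers.1.w": "layers.4.w"}.
    """
    for small_name, large_name in layer_map.items():
        src, dst = large_state[large_name], small_state[small_name]
        # Keep the leading slice of each dimension so the shapes match;
        # a real method might instead select neurons by importance scores.
        slices = tuple(slice(0, s) for s in dst.shape)
        small_state[small_name] = src[slices].clone()
    return small_state
```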
1 code implementation • NeurIPS 2023 • Yuchuan Tian, Hanting Chen, Tianyu Guo, Chao Xu, Yunhe Wang
To this end, we propose a Rank-based PruninG (RPG) method to maintain the ranks of sparse weights in an adversarial manner.
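One way to read "adversarial rank maintenance" is as a regularizer that pushes the sparse weight away from its own best low-rank approximation. The sketch below illustrates that idea under stated assumptions; it is not the authors' exact RPG objective:

```python
# Hedged sketch of an adversarial rank-maintaining regularizer.
import torch

def rank_regularizer(weight, mask, r=8):
    w = weight * mask  # current sparse (masked) weight
    u, s, vh = torch.linalg.svd(w, full_matrices=False)
    # Best rank-r approximation (Eckart-Young) acts as the "adversary";
    # detach so the weight update treats it as fixed.
    w_lowrank = ((u[:, :r] * s[:r]) @ vh[:r, :]).detach()
    # Maximizing the distance to the adversary keeps the sparse weight
    # high-rank, so return the negation to be *added* to the task loss.
    return -torch.linalg.norm(w - w_lowrank)
```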
3 code implementations • 29 May 2023 • Yuchuan Tian, Hanting Chen, Xutao Wang, Zheyuan Bai, Qinghua Zhang, Ruifeng Li, Chao Xu, Yunhe Wang
Recent releases of Large Language Models (LLMs), e.g., ChatGPT, are astonishingly good at generating human-like text, but they may undermine the authenticity of texts.
no code implementations • 29 Sep 2021 • Xiaochen Zhou, Yuchuan Tian, Xudong Wang
Moreover, to prevent the compact model from forgetting the knowledge of the source data during knowledge distillation, a collaborative knowledge distillation (Co-KD) method is developed that unifies the source data on the server with the target data on the edge device to train the compact model.
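A minimal sketch of this collaborative-distillation idea follows: train the compact student on a batch that mixes server-side source data with edge-side target data, distilling soft labels from the teacher on both. The function name, temperature, and mixing scheme are assumptions for illustration, not the paper's exact procedure:

```python
# Hypothetical Co-KD-style training step (illustrative sketch).
import torch
import torch.nn.functional as F

def co_kd_step(student, teacher, src_x, tgt_x, optimizer, T=2.0):
    x = torch.cat([src_x, tgt_x], dim=0)  # unify source + target batches
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    # Soft-label distillation loss (Hinton et al.), scaled by T^2 as usual.
    loss = F.kl_div(
        F.log_softmax(s_logits / T, dim=-1),
        F.softmax(t_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Distilling on the unified batch is what lets the student keep fitting the source distribution while adapting to the target data, which is the stated goal of Co-KD.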