no code implementations • 27 Apr 2024 • Haoran Lian, Yizhe Xiong, Jianwei Niu, Shasha Mo, Zhenpeng Su, Zijia Lin, Peng Liu, Hui Chen, Guiguang Ding
Due to their infrequent appearance in the text corpus, Scaffold Tokens pose a learning imbalance issue for language models.
no code implementations • 27 Apr 2024 • Yizhe Xiong, Xiansheng Chen, Xin Ye, Hui Chen, Zijia Lin, Haoran Lian, Jianwei Niu, Guiguang Ding
We first investigate the imbalance of loss across token positions and develop a reciprocal law that holds across model scales and training stages.
no code implementations • 14 Mar 2024 • Yizhe Xiong, Hui Chen, Tianxiang Hao, Zijia Lin, Jungong Han, Yuesong Zhang, Guoxin Wang, Yongjun Bao, Guiguang Ding
Consequently, a simple combination of them cannot guarantee both training efficiency and inference efficiency at minimal cost.
1 code implementation • ICCV 2023 • Yizhe Xiong, Hui Chen, Zijia Lin, Sicheng Zhao, Guiguang Ding
To address this issue, recent works consider Few-shot Unsupervised Domain Adaptation (FUDA), where only a few source samples are labeled, and conduct knowledge transfer via self-supervised learning methods.