Search Results for author: Yizhe Xiong

Found 4 papers, 1 paper with code

Scaffold-BPE: Enhancing Byte Pair Encoding with Simple and Effective Scaffold Token Removal

no code implementations • 27 Apr 2024 • Haoran Lian, Yizhe Xiong, Jianwei Niu, Shasha Mo, Zhenpeng Su, Zijia Lin, Peng Liu, Hui Chen, Guiguang Ding

Due to their infrequent appearance in the text corpus, Scaffold Tokens pose a learning imbalance issue for language models.

Language Modelling, Machine Translation

Temporal Scaling Law for Large Language Models

no code implementations • 27 Apr 2024 • Yizhe Xiong, Xiansheng Chen, Xin Ye, Hui Chen, Zijia Lin, Haoran Lian, Jianwei Niu, Guiguang Ding

We first investigate the imbalance of loss across token positions and develop a reciprocal law across model scales and training stages.

PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation

no code implementations • 14 Mar 2024 • Yizhe Xiong, Hui Chen, Tianxiang Hao, Zijia Lin, Jungong Han, Yuesong Zhang, Guoxin Wang, Yongjun Bao, Guiguang Ding

Consequently, a simple combination of them cannot guarantee accomplishing both training efficiency and inference efficiency with minimal costs.

Model Compression

Confidence-based Visual Dispersal for Few-shot Unsupervised Domain Adaptation

1 code implementation • ICCV 2023 • Yizhe Xiong, Hui Chen, Zijia Lin, Sicheng Zhao, Guiguang Ding

To address this issue, recent works consider Few-shot Unsupervised Domain Adaptation (FUDA), where only a few source samples are labeled, and conduct knowledge transfer via self-supervised learning methods.

Self-Supervised Learning, Transfer Learning
