Search Results for author: Miaosen Zhang

Found 2 papers, 2 papers with code

Transformer as Linear Expansion of Learngene

1 code implementation • 9 Dec 2023 • Shiyu Xia, Miaosen Zhang, Xu Yang, Ruiming Chen, Haokun Chen, Xin Geng

When models of varying depths are needed to fit different resource constraints, TLEG achieves comparable results to the pre-training and fine-tuning approach while storing around 19x fewer parameters to initialize these models and incurring around 5x lower pre-training costs.
