Search Results for author: Zheyuan Bai

Found 4 papers, 2 papers with code

Rethinking Optimization and Architecture for Tiny Language Models

1 code implementation • 5 Feb 2024 • Yehui Tang, Fangcheng Liu, Yunsheng Ni, Yuchuan Tian, Zheyuan Bai, Yi-Qi Hu, Sichao Liu, Shangling Jui, Kai Han, Yunhe Wang

Several design formulas are empirically proved especially effective for tiny language models, including tokenizer compression, architecture tweaking, parameter inheritance and multiple-round training.

Language Modelling

Paper
Code

PanGu-$π$: Enhancing Language Model Architectures via Nonlinearity Compensation

no code implementations • 27 Dec 2023 • Yunhe Wang, Hanting Chen, Yehui Tang, Tianyu Guo, Kai Han, Ying Nie, Xutao Wang, Hailin Hu, Zheyuan Bai, Yun Wang, Fangcheng Liu, Zhicheng Liu, Jianyuan Guo, Sinan Zeng, Yinchen Zhang, Qinghua Xu, Qun Liu, Jun Yao, Chao Xu, DaCheng Tao

We then demonstrate that the proposed approach is significantly effective for enhancing the model nonlinearity through carefully designed ablations; thus, we present a new efficient model architecture for establishing modern, namely, PanGu-$\pi$.

Language Modelling

Paper
Add Code

Data-Free Distillation of Language Model by Text-to-Text Transfer

no code implementations • 3 Nov 2023 • Zheyuan Bai, Xinduo Liu, Hailin Hu, Tianyu Guo, Qinghua Zhang, Yunhe Wang

Data-Free Knowledge Distillation (DFKD) plays a vital role in compressing the model when original training data is unavailable.

Data-free Knowledge Distillation Language Modelling +4

Paper
Add Code

Multiscale Positive-Unlabeled Detection of AI-Generated Texts

3 code implementations • 29 May 2023 • Yuchuan Tian, Hanting Chen, Xutao Wang, Zheyuan Bai, Qinghua Zhang, Ruifeng Li, Chao Xu, Yunhe Wang

Recent releases of Large Language Models (LLMs), e. g. ChatGPT, are astonishing at generating human-like texts, but they may impact the authenticity of texts.

Language Modelling text-classification +2

1,116

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.