Search Results for author: Ze-Feng Gao

Found 9 papers, 5 papers with code

AI-accelerated Discovery of Altermagnetic Materials

1 code implementation • 8 Nov 2023 • Ze-Feng Gao, Shuai Qu, Bocheng Zeng, Yang Liu, Ji-Rong Wen, Hao Sun, Peng-Jie Guo, Zhong-Yi Lu

Altermagnetism, a new magnetic phase, has been theoretically proposed and experimentally verified to be distinct from ferromagnetism and antiferromagnetism.

Do Emergent Abilities Exist in Quantized Large Language Models: An Empirical Study

1 code implementation • 16 Jul 2023 • Peiyu Liu, Zikang Liu, Ze-Feng Gao, Dawei Gao, Wayne Xin Zhao, Yaliang Li, Bolin Ding, Ji-Rong Wen

Unlike previous studies that focused on overall performance, this work investigates the impact of quantization on emergent abilities, which are important characteristics that distinguish LLMs from small language models.

In-Context Learning • Instruction Following • +1
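As a rough illustration of the kind of post-training weight quantization such a study covers, here is a minimal, self-contained sketch of symmetric k-bit quantization of a weight tensor. The function names and the single per-tensor scale are illustrative assumptions, not the paper's exact protocol.

```python
# Minimal sketch of symmetric per-tensor k-bit post-training quantization,
# as a stand-in for the low-bit settings such a study covers. Function names
# and the single per-tensor scale are illustrative assumptions.
import numpy as np

def quantize(w: np.ndarray, bits: int = 4):
    """Map float weights to signed integer codes in [-(2**(bits-1)-1), 2**(bits-1)-1]."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax                       # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the integer codes."""
    return q.astype(np.float32) * scale

w = np.random.randn(16, 16).astype(np.float32)
q, s = quantize(w, bits=4)
print("mean abs reconstruction error:", np.abs(w - dequantize(q, s)).mean())
```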

Scaling Pre-trained Language Models to Deeper via Parameter-efficient Architecture

no code implementations • 27 Mar 2023 • Peiyu Liu, Ze-Feng Gao, Yushuo Chen, Wayne Xin Zhao, Ji-Rong Wen

Based on such a decomposition, our architecture shares the central tensor across all layers to reduce the model size, while keeping layer-specific auxiliary tensors (also using adapters) to enhance adaptation flexibility.
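Below is a hedged PyTorch sketch of the sharing scheme described above: each layer's weight is held as a small chain of MPO cores, the large central core is a single shared parameter, and only the small outer (auxiliary) cores are layer-specific. Core shapes, the bond dimension, and the class name are illustrative, not the paper's configuration.

```python
# Hedged sketch: each layer's weight is a 3-core matrix product operator (MPO);
# the large central core is one shared parameter, while the small outer
# ("auxiliary") cores stay layer-specific. Shapes and names are illustrative,
# not the paper's configuration (which also adds adapters).
import torch
import torch.nn as nn

class MPOSharedLinear(nn.Module):
    def __init__(self, central, out_dims=(4, 32, 4), in_dims=(4, 32, 4), bond=8):
        super().__init__()
        self.out_dims, self.in_dims = out_dims, in_dims
        # Small, layer-specific auxiliary cores.
        self.left = nn.Parameter(torch.randn(out_dims[0], in_dims[0], bond) * 0.1)
        self.right = nn.Parameter(torch.randn(bond, out_dims[2], in_dims[2]) * 0.1)
        # Large central core, created once outside and shared by every layer.
        self.central = central

    def forward(self, x):
        # Contract the three cores back into a dense (out, in) weight matrix.
        w = torch.einsum('axb,bcyd,dez->acexyz', self.left, self.central, self.right)
        out_size = self.out_dims[0] * self.out_dims[1] * self.out_dims[2]
        in_size = self.in_dims[0] * self.in_dims[1] * self.in_dims[2]
        return x @ w.reshape(out_size, in_size).t()

bond = 8
shared_central = nn.Parameter(torch.randn(bond, 32, 32, bond) * 0.05)  # ~65k params, stored once
layers = nn.ModuleList([MPOSharedLinear(shared_central) for _ in range(12)])
x = torch.randn(2, 4 * 32 * 4)
for layer in layers:                    # each layer adds only ~256 auxiliary parameters
    x = torch.relu(layer(x))
print(x.shape)                          # torch.Size([2, 512])
```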

Parameter-Efficient Mixture-of-Experts Architecture for Pre-trained Language Models

2 code implementations • COLING 2022 • Ze-Feng Gao, Peiyu Liu, Wayne Xin Zhao, Zhong-Yi Lu, Ji-Rong Wen

Recently, the Mixture-of-Experts (MoE) architecture has achieved remarkable success in increasing the model capacity of large-scale language models.

Language Modelling • Multi-Task Learning • +2
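For context, a minimal sketch of a generic top-1 gated MoE feed-forward layer, the capacity-scaling idea the abstract refers to. It deliberately omits the paper's MPO-based parameter sharing among experts; sizes, names, and the routing rule are illustrative.

```python
# Generic, minimal top-1 gated Mixture-of-Experts feed-forward layer. It does
# NOT include the paper's MPO-based parameter sharing; sizes and names are
# illustrative only.
import torch
import torch.nn as nn

class Top1MoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=4):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                          # x: (tokens, d_model)
        scores = self.gate(x).softmax(dim=-1)      # routing probabilities
        expert_idx = scores.argmax(dim=-1)         # top-1 expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                # Scale by the gate probability so routing stays differentiable.
                out[mask] = expert(x[mask]) * scores[mask, i].unsqueeze(-1)
        return out

print(Top1MoE()(torch.randn(10, 64)).shape)        # torch.Size([10, 64])
```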

Image Dataset Compression Based on Matrix Product States

no code implementations • 29 Sep 2021 • Ze-Feng Gao, Peiyu Liu, Xiao-Hui Zhang, Xin Zhao, Z. Y. Xie, Zhong-Yi Lu, Ji-Rong Wen

Based on the MPS structure, we propose a new dataset compression method that compresses datasets by filtering long-range correlation information in task-agnostic scenarios and uses dataset distillation to supplement the information in task-specific scenarios.
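A minimal numpy sketch of the MPS (tensor-train) machinery behind such a method: reshape an image into a multi-index tensor, factor it into a chain of small cores with truncated SVDs (the truncation is what discards long-range correlation information), then reconstruct. The reshape scheme and bond dimension are illustrative assumptions, not the paper's settings.

```python
# Minimal MPS (tensor-train) decomposition of an image-like tensor via
# sequential truncated SVDs, then reconstruction. Reshape scheme and bond
# dimension are illustrative assumptions.
import numpy as np

def mps_decompose(tensor, max_bond):
    """Factor an n-index tensor into a chain of MPS cores via truncated SVDs."""
    dims = tensor.shape
    cores, bond = [], 1
    mat = tensor.reshape(bond * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        chi = min(max_bond, len(s))                      # truncation step
        cores.append(u[:, :chi].reshape(bond, dims[k], chi))
        bond = chi
        mat = (np.diag(s[:chi]) @ vt[:chi]).reshape(bond * dims[k + 1], -1)
    cores.append(mat.reshape(bond, dims[-1], 1))
    return cores

def mps_reconstruct(cores):
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.squeeze(axis=(0, -1))

img = np.random.rand(4, 4, 4, 4, 4, 4)     # stand-in for a 64x64 image reshaped to 4^6
cores = mps_decompose(img, max_bond=8)
approx = mps_reconstruct(cores)
print("relative error:", np.linalg.norm(img - approx) / np.linalg.norm(img))
```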

Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators

1 code implementation • ACL 2021 • Peiyu Liu, Ze-Feng Gao, Wayne Xin Zhao, Z. Y. Xie, Zhong-Yi Lu, Ji-Rong Wen

This paper presents a novel pre-trained language model (PLM) compression approach based on the matrix product operator (MPO) from quantum many-body physics.

Language Modelling • Model Compression
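A hedged, self-contained sketch of the MPO idea applied to one weight matrix: regroup its row and column indices into local pairs, split it with a single truncated SVD, and compare parameter counts. The paper factorizes weights into a longer chain of local tensors; the two-core split, shapes, bond dimension, and the random stand-in weight here are illustrative only.

```python
# Two-core MPO-style split of one dense weight matrix; the paper uses a longer
# chain of local tensors. All shapes and the random weight are illustrative.
import numpy as np

(o1, o2), (i1, i2), bond = (24, 32), (24, 32), 48
W = np.random.randn(o1 * o2, i1 * i2)                  # a 768 x 768 dense weight

# Pair row/column factors: (o1, o2, i1, i2) -> (o1, i1) x (o2, i2), so each
# factor acts like a small local operator.
T = W.reshape(o1, o2, i1, i2).transpose(0, 2, 1, 3).reshape(o1 * i1, o2 * i2)
u, s, vt = np.linalg.svd(T, full_matrices=False)
left = u[:, :bond] * s[:bond]                          # core 1: (o1*i1, bond)
right = vt[:bond]                                      # core 2: (bond, o2*i2)

dense_params = W.size
mpo_params = left.size + right.size
rel_err = np.linalg.norm(T - left @ right) / np.linalg.norm(T)
print(f"dense {dense_params} -> MPO {mpo_params} params, rel. error {rel_err:.3f}")
```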

Compressing LSTM Networks by Matrix Product Operators

no code implementations • 22 Dec 2020 • Ze-Feng Gao, Xingwei Sun, Lan Gao, Junfeng Li, Zhong-Yi Lu

In this paper, we propose a matrix product operator (MPO)-based neural network architecture to replace the LSTM model.

Networking and Internet Architecture • Computational Physics • Quantum Physics

A Model Compression Method with Matrix Product Operators for Speech Enhancement

no code implementations • 10 Oct 2020 • Xingwei Sun, Ze-Feng Gao, Zhong-Yi Lu, Junfeng Li, Yonghong Yan

In this paper, we propose a model compression method based on matrix product operators (MPO) to substantially reduce the number of parameters in DNN models for speech enhancement.

Model Compression • Speech Enhancement

Compressing deep neural networks by matrix product operators

1 code implementation • 11 Apr 2019 • Ze-Feng Gao, Song Cheng, Rong-Qiang He, Z. Y. Xie, Hui-Hai Zhao, Zhong-Yi Lu, Tao Xiang

A deep neural network is a parametrization of a multilayer mapping of signals in terms of many alternately arranged linear and nonlinear transformations.
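Read literally, that sentence is just the usual composition of linear maps and elementwise nonlinearities; a minimal sketch follows, with the linear factors being what the paper proposes to represent as MPOs. Depth and layer sizes here are arbitrary.

```python
# Minimal sketch: a deep network as alternately arranged linear maps and
# elementwise nonlinearities. The linear factors (the weight matrices) are
# what the paper proposes to represent as MPOs; depth and sizes are arbitrary.
import torch
import torch.nn as nn

layers = []
for _ in range(4):                       # four linear/nonlinear pairs
    layers += [nn.Linear(128, 128), nn.ReLU()]
net = nn.Sequential(*layers)

x = torch.randn(8, 128)                  # a batch of input signals
print(net(x).shape)                      # torch.Size([8, 128])
```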
