1 code implementation • 8 Nov 2023 • Ze-Feng Gao, Shuai Qu, Bocheng Zeng, Yang Liu, Ji-Rong Wen, Hao Sun, Peng-Jie Guo, Zhong-Yi Lu
Altermagnetism, a new magnetic phase, has been theoretically proposed and experimentally verified to be distinct from ferromagnetism and antiferromagnetism.
1 code implementation • 16 Jul 2023 • Peiyu Liu, Zikang Liu, Ze-Feng Gao, Dawei Gao, Wayne Xin Zhao, Yaliang Li, Bolin Ding, Ji-Rong Wen
Different from previous studies focused on overall performance, this work aims to investigate the impact of quantization on emergent abilities, which are important characteristics that distinguish LLMs from small language models.
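To illustrate the kind of operation whose effect on emergent abilities is being studied, here is a minimal sketch of symmetric per-tensor round-to-nearest INT8 quantization applied to a weight matrix; the function names and the toy matrix size are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor round-to-nearest quantization to INT8."""
    scale = np.abs(w).max() / 127.0  # map the largest weight to +/-127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float weight matrix."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(768, 768)).astype(np.float32)  # toy weight matrix
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("mean abs error:", np.abs(w - w_hat).mean())
```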
no code implementations • 27 Mar 2023 • Peiyu Liu, Ze-Feng Gao, Yushuo Chen, Wayne Xin Zhao, Ji-Rong Wen
Based on such a decomposition, our architecture shares the central tensor across all layers to reduce the model size, while keeping layer-specific auxiliary tensors (together with adapters) to enhance adaptation flexibility.
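A minimal sketch of the sharing idea, assuming a simplified three-factor decomposition W_l ≈ U_l C V_l per layer: one large central matrix C is stored once and reused by every layer, while each layer keeps only its small auxiliary factors. The shapes and names here are illustrative assumptions, not the paper's exact MPO structure.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n_layers = 512, 32, 12

# One large central tensor, shared by all layers.
central = rng.normal(size=(r, r)) / np.sqrt(r)

# Small layer-specific auxiliary factors (these carry the adaptation capacity).
aux = [(rng.normal(size=(d, r)) / np.sqrt(d),
        rng.normal(size=(r, d)) / np.sqrt(r)) for _ in range(n_layers)]

def layer_weight(l):
    """Reconstruct layer l's weight from shared + layer-specific parts."""
    u, v = aux[l]
    return u @ central @ v  # W_l ~ U_l C V_l

shared = central.size
per_layer = sum(u.size + v.size for u, v in aux)
full = n_layers * d * d
print(f"params: {shared + per_layer} vs {full} (dense)")
```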
2 code implementations • COLING 2022 • Ze-Feng Gao, Peiyu Liu, Wayne Xin Zhao, Zhong-Yi Lu, Ji-Rong Wen
Recently, the Mixture-of-Experts (MoE) architecture has achieved remarkable success in increasing the model capacity of large-scale language models.
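For readers unfamiliar with the architecture, here is a minimal top-1 MoE layer in NumPy: a gating network scores the experts for each token, and each token is processed only by its best-scoring expert. The sizes, the linear experts, and the softmax gate are generic assumptions, not the specific routing used in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, n_tokens = 64, 4, 10

W_gate = rng.normal(size=(d, n_experts))           # gating network
experts = [rng.normal(size=(d, d)) / np.sqrt(d)    # one linear "expert" each
           for _ in range(n_experts)]

def moe_layer(x):
    """Route each token to its top-1 expert, weighted by the gate probability."""
    logits = x @ W_gate
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    top1 = probs.argmax(axis=-1)
    out = np.empty_like(x)
    for e in range(n_experts):
        mask = top1 == e
        if mask.any():
            out[mask] = (x[mask] @ experts[e]) * probs[mask, e][:, None]
    return out

x = rng.normal(size=(n_tokens, d))
print(moe_layer(x).shape)  # (10, 64)
```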
no code implementations • 29 Sep 2021 • Ze-Feng Gao, Peiyu Liu, Xiao-Hui Zhang, Xin Zhao, Z. Y. Xie, Zhong-Yi Lu, Ji-Rong Wen
Based on the MPS structure, we propose a new dataset compression method: it compresses datasets by filtering out long-range correlation information in task-agnostic scenarios, and uses dataset distillation to supplement that information in task-specific scenarios.
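The core operation behind this kind of MPS compression can be sketched as a sequence of truncated SVDs: a data vector is reshaped into a chain of small indices and factorized site by site, and capping the bond dimension discards small singular values, which is what filters long-range correlations. The bond cap and shapes below are illustrative assumptions, not the paper's method.

```python
import numpy as np

def to_mps(vec, phys_dim=2, max_bond=4):
    """Factorize a length-(phys_dim**n) vector into MPS cores via truncated SVDs."""
    n = int(round(np.log(vec.size) / np.log(phys_dim)))
    cores, mat = [], vec.reshape(1, -1)
    for _ in range(n - 1):
        rows = mat.shape[0] * phys_dim
        u, s, vt = np.linalg.svd(mat.reshape(rows, -1), full_matrices=False)
        k = min(max_bond, len(s))  # truncation filters long-range correlations
        cores.append(u[:, :k].reshape(-1, phys_dim, k))
        mat = s[:k, None] * vt[:k]
    cores.append(mat.reshape(-1, phys_dim, 1))
    return cores

def from_mps(cores):
    """Contract the MPS cores back into a dense vector."""
    out = cores[0]
    for c in cores[1:]:
        out = np.einsum('apk,kql->apql', out, c).reshape(1, -1, c.shape[-1])
    return out.ravel()

rng = np.random.default_rng(0)
v = rng.normal(size=2**10)
cores = to_mps(v, max_bond=8)
print("relative error:", np.linalg.norm(v - from_mps(cores)) / np.linalg.norm(v))
```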
1 code implementation • ACL 2021 • Peiyu Liu, Ze-Feng Gao, Wayne Xin Zhao, Z. Y. Xie, Zhong-Yi Lu, Ji-Rong Wen
This paper presents a novel compression approach for pre-trained language models (PLMs) based on the matrix product operator (MPO) from quantum many-body physics.
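A minimal two-core sketch of the MPO idea: a weight matrix is reshaped so that its input and output indices are split into local pairs, and a truncated SVD between the pairs yields small local tensors whose sizes are controlled by the bond dimension. The shapes and the two-core split are illustrative; the paper's MPO chains use more cores.

```python
import numpy as np

def mpo2_decompose(w, i1, i2, o1, o2, bond):
    """Split W[(i1 i2), (o1 o2)] into two local tensors via a truncated SVD."""
    assert w.shape == (i1 * i2, o1 * o2)
    # Group (input, output) indices locally: (i1, o1) x (i2, o2).
    t = w.reshape(i1, i2, o1, o2).transpose(0, 2, 1, 3).reshape(i1 * o1, i2 * o2)
    u, s, vt = np.linalg.svd(t, full_matrices=False)
    k = min(bond, len(s))
    core1 = u[:, :k].reshape(i1, o1, k)                # local tensor 1
    core2 = (s[:k, None] * vt[:k]).reshape(k, i2, o2)  # local tensor 2
    return core1, core2

def mpo2_reconstruct(core1, core2):
    """Contract the two local tensors back into a dense weight matrix."""
    i1, o1, _ = core1.shape
    _, i2, o2 = core2.shape
    t = np.einsum('aok,kbp->abop', core1, core2)  # (i1, i2, o1, o2)
    return t.reshape(i1 * i2, o1 * o2)

rng = np.random.default_rng(0)
w = rng.normal(size=(16 * 16, 16 * 16))
c1, c2 = mpo2_decompose(w, 16, 16, 16, 16, bond=32)
w_hat = mpo2_reconstruct(c1, c2)
print("params:", c1.size + c2.size, "vs", w.size)
print("relative error:", np.linalg.norm(w - w_hat) / np.linalg.norm(w))
```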
no code implementations • 22 Dec 2020 • Ze-Feng Gao, Xingwei Sun, Lan Gao, Junfeng Li, Zhong-Yi Lu
In this paper, we propose a matrix product operator (MPO) based neural network architecture to replace the LSTM model.
no code implementations • 10 Oct 2020 • Xingwei Sun, Ze-Feng Gao, Zhong-Yi Lu, Junfeng Li, Yonghong Yan
In this paper, we propose a model compression method based on matrix product operators (MPO) to substantially reduce the number of parameters in DNN models for speech enhancement.
1 code implementation • 11 Apr 2019 • Ze-Feng Gao, Song Cheng, Rong-Qiang He, Z. Y. Xie, Hui-Hai Zhao, Zhong-Yi Lu, Tao Xiang
A deep neural network is a parametrization of a multilayer mapping of signals in terms of many alternately arranged linear and nonlinear transformations.