no code implementations • 25 Apr 2024 • Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang
Large language models (LLMs) have showcased profound capabilities in language understanding and generation, facilitating a wide array of applications.
no code implementations • 4 Mar 2024 • Siqi Fan, Xin Jiang, Xiang Li, Xuying Meng, Peng Han, Shuo Shang, Aixin Sun, Yequan Wang, Zhongyuan Wang
To answer this question, we first indicate that Not all Layers are Necessary during Inference by statistically analyzing the activated layers across tasks.
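The idea that later layers often contribute little at inference time can be sketched as a simple early-exit loop. This is a hypothetical illustration, not the paper's actual method: the cosine-similarity stopping criterion, the threshold, and the toy "layers" are all assumptions made for the example.

```python
# Hypothetical sketch of early-exit inference: stop running transformer
# layers once consecutive hidden states stop changing appreciably.
# The convergence criterion (cosine similarity) is an illustrative
# assumption, not the paper's exact statistical analysis.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def early_exit_forward(layers, hidden, threshold=0.999):
    """Apply layers in order; exit once consecutive hidden states converge."""
    used = 0
    for layer in layers:
        new_hidden = layer(hidden)
        used += 1
        converged = cosine_similarity(hidden, new_hidden) > threshold
        hidden = new_hidden
        if converged:
            break
    return hidden, used

# Toy "layers": each nudges the state less than the last, mimicking the
# saturation that makes later layers unnecessary.
layers = [lambda h, s=s: [x + s for x in h] for s in (1.0, 0.1, 0.001, 0.0001)]
out, n_used = early_exit_forward(layers, [1.0, 2.0, 3.0])
print(n_used)  # runs only 2 of the 4 layers on this toy input
```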
no code implementations • 19 Feb 2024 • Xiaowei Yuan, Zhao Yang, Yequan Wang, Shengping Liu, Jun Zhao, Kang Liu
Large language models internalize enormous parametric knowledge during pre-training.
no code implementations • 4 Jan 2024 • Haitong Luo, Xuying Meng, Suhang Wang, Hanyun Cao, Weiyao Zhang, Yequan Wang, Yujun Zhang
In this study, we present a novel approach called Spectral-based Complementary Graph Neural Networks (SComGNN) that utilizes the spectral properties of complementary item graphs.
1 code implementation • 14 Dec 2023 • Xingrun Xing, Li Du, Xinyuan Wang, Xianlin Zeng, Yequan Wang, Zheng Zhang, Jiajun Zhang
Specifically, we first analyze the binarization error in self-attention operations and derive the polynomials of binarization error.
no code implementations • 11 Sep 2023 • Li Du, Yequan Wang, Xingrun Xing, Yiqun Yao, Xiang Li, Xin Jiang, Xuezhi Fang

Although demonstrating superb performance on various NLP tasks, large language models (LLMs) still suffer from the hallucination problem, which threatens the reliability of LLMs.
no code implementations • 7 Sep 2023 • Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Xuying Meng, Siqi Fan, Peng Han, Jing Li, Li Du, Bowen Qin, Zheng Zhang, Aixin Sun, Yequan Wang
We demonstrate that a 101B-parameter LLM with 0.31T tokens can be trained with a budget of 100K US dollars.
no code implementations • 15 Jun 2023 • Jing Li, Yequan Wang, Shuai Zhang, Min Zhang
Recently, numerous efforts have continued to push the performance boundaries of document-level relation extraction (DocRE) and have claimed significant progress in DocRE.
1 code implementation • 4 May 2023 • Yiqun Yao, Zheng Zhang, Jing Li, Yequan Wang
In terms of growth schedule, the impact of each single dimension on a schedule's efficiency is under-explored by existing work.
no code implementations • 2 May 2023 • Xiang Li, Xin Jiang, Xuying Meng, Aixin Sun, Yequan Wang
FreeLM outperforms large models, e.g., GPT-3 and InstructGPT, on a range of language understanding tasks in experiments.
no code implementations • 19 Apr 2023 • Xuying Meng, Chungang Lin, Yequan Wang, Yujun Zhang
Pretrained models for network traffic can utilize large-scale raw data to learn the essential characteristics of network traffic, and generate distinguishable results for input traffic without considering specific downstream tasks.
1 code implementation • 14 Apr 2023 • Yiqun Yao, Siqi Fan, Xiusheng Huang, Xuezhi Fang, Xiang Li, Ziyi Ni, Xin Jiang, Xuying Meng, Peng Han, Shuo Shang, Kang Liu, Aixin Sun, Yequan Wang
With around 14% of the one-time pre-training cost, we can accurately forecast the loss for models up to 52B.
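Forecasting a large model's loss from cheaper small-scale runs typically means fitting a scaling law and extrapolating. The sketch below fits a simple power law on synthetic small-model losses and predicts a 52B point; the functional form, constants, and model sizes are made-up assumptions for illustration, and the paper's actual predictor may differ.

```python
# A minimal scaling-law-style sketch: fit L(N) = a * N**(-b) on small
# models' losses, then extrapolate to a larger model size. All numbers
# here are synthetic stand-ins, not results from the paper.
import math

def fit_power_law(sizes, losses):
    """Least-squares fit of log(loss) = log(a) - b * log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(l) for l in losses]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    a = math.exp(my - slope * mx)
    return a, -slope  # loss ≈ a * size**(-b)

# Synthetic losses generated from a known law L = 8 * N**(-0.1):
sizes = [1e8, 1e9, 1e10]
losses = [8 * n ** -0.1 for n in sizes]
a, b = fit_power_law(sizes, losses)
predicted_52b = a * (52e9) ** (-b)  # extrapolated loss at 52B parameters
```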
no code implementations • 17 Mar 2023 • Jingxuan Wei, Shiyu Wu, Xin Jiang, Yequan Wang
We introduce DialogPaint, a novel framework that bridges conversational interactions with image editing, enabling users to modify images through natural dialogue.
no code implementations • 15 Mar 2023 • Yequan Wang, Hengran Zhang, Aixin Sun, Xuying Meng
Given comparative text, comparative relation extraction aims to extract two targets (e.g., two cameras) in comparison and the aspect they are compared on (e.g., image quality).
no code implementations • 23 Oct 2022 • Xiaohan Xu, Xuying Meng, Yequan Wang
Further experiments prove that abundant prior knowledge is conducive to high-quality emotional support, and a well-learned latent variable is critical to the diversity of generations.
no code implementations • 12 Oct 2022 • Yequan Wang, Jiawen Deng, Aixin Sun, Xuying Meng
Recently, many works have used perplexity (PPL) to evaluate the quality of generated text.
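Perplexity is the exponentiated average negative log-likelihood a model assigns to a sequence: lower PPL means the model finds the text more predictable. A minimal computation, with made-up token probabilities standing in for a model's outputs:

```python
# Perplexity from per-token probabilities:
# PPL = exp(-(1/N) * sum(log p_i)). The probability values below are
# illustrative stand-ins, not outputs of any real language model.
import math

def perplexity(token_probs):
    """Exponentiated mean negative log-likelihood over a token sequence."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

fluent = perplexity([0.5, 0.4, 0.6, 0.5])        # model fairly confident
disfluent = perplexity([0.05, 0.1, 0.02, 0.08])  # model surprised
```

Note that a uniform probability of 0.5 per token gives PPL exactly 2, which makes the measure easy to sanity-check.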
1 code implementation • COLING 2022 • Yequan Wang, Xiang Li, Aixin Sun, Xuying Meng, Huaming Liao, Jiafeng Guo
CofeNet is able to extract complicated quotations with components of variable lengths and complicated structures.
no code implementations • 23 Mar 2022 • Yequan Wang, Xuying Meng, Yiyi Liu, Aixin Sun, Yao Wang, Yinhe Zheng, Minlie Huang
These models hence are not optimized for dialog-level emotion detection, i.e., to predict the emotion category of a dialog as a whole.
no code implementations • 5 Feb 2022 • Ting Lin, Aixin Sun, Yequan Wang
A sentence may express sentiments on multiple aspects.
Aspect-Based Sentiment Analysis (ABSA) +1
1 code implementation • Findings (NAACL) 2022 • Yiyi Liu, Yequan Wang, Aixin Sun, Xuying Meng, Jing Li, Jiafeng Guo
Based on this dual-channel framework, we design the Dual-Channel Network (DC-Net) to recognize sentiment conflict.
no code implementations • 13 Feb 2020 • Funan Mu, Zhenting Yu, LiFeng Wang, Yequan Wang, Qingyu Yin, Yibo Sun, Liqun Liu, Teng Ma, Jing Tang, Xing Zhou
In addition, with the help of tokens, our model is able to extract overlapped keyphrases.
1 code implementation • The Web Conference (WWW) 2018 • Yequan Wang, Aixin Sun, Jialong Han, Ying Liu, Xiaoyan Zhu
Based on the capsule representation, the probability module computes each capsule's state probability.
Ranked #6 on Sentiment Analysis on MR