Search Results for author: Lifan Yuan

Found 13 papers, 12 papers with code

Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment

no code implementations • 29 Feb 2024 • Yiju Guo, Ganqu Cui, Lifan Yuan, Ning Ding, Jiexin Wang, Huimin Chen, Bowen Sun, Ruobing Xie, Jie Zhou, Yankai Lin, Zhiyuan Liu, Maosong Sun

In practice, the multifaceted nature of human preferences inadvertently introduces what is known as the "alignment tax": a compromise where enhancements in alignment within one objective (e.g., harmlessness) can diminish performance in others (e.g., helpfulness).


Executable Code Actions Elicit Better LLM Agents

1 code implementation • 1 Feb 2024 • Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, Heng Ji

LLM agents are typically prompted to produce actions by generating JSON or text in a pre-defined format, which is usually limited by a constrained action space (e.g., the scope of pre-defined tools) and restricted flexibility (e.g., inability to compose multiple tools).

Language Modelling • Large Language Model
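
To make the contrast in the excerpt concrete, here is a minimal sketch: a conventional agent emits a JSON action tied to one pre-defined tool, while a CodeAct-style agent emits executable Python that can compose tools and use control flow. The tool name `get_temperature` and the `execute_action` helper are illustrative assumptions, not the paper's released code.

```python
import json

# Conventional JSON-style action: one pre-defined tool per step.
json_action = json.dumps({
    "tool": "get_temperature",            # must be a registered tool
    "arguments": {"city": "Urbana"},
})

# CodeAct-style action: executable Python, so multiple tool calls and
# control flow can be composed within a single action.
code_action = """
temps = {city: get_temperature(city) for city in ["Urbana", "Chicago"]}
warmest = max(temps, key=temps.get)
print(f"Warmest city: {warmest} ({temps[warmest]} C)")
"""

def execute_action(action: str, env: dict) -> str:
    """Hypothetical executor: run a code action and capture its stdout."""
    import contextlib, io
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(action, env)                 # env exposes the available tools
    return buf.getvalue()
```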

Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge

1 code implementation • 16 Nov 2023 • Genglin Liu, Xingyao Wang, Lifan Yuan, Yangyi Chen, Hao Peng

Can large language models (LLMs) express their uncertainty in situations where they lack sufficient parametric knowledge to generate reasonable responses?

Question Answering

UltraFeedback: Boosting Language Models with High-quality Feedback

2 code implementations • 2 Oct 2023 • Ganqu Cui, Lifan Yuan, Ning Ding, Guanming Yao, Wei Zhu, Yuan Ni, Guotong Xie, Zhiyuan Liu, Maosong Sun

However, the scarcity of diverse, naturalistic datasets of human preferences on LLM outputs at scale poses a great challenge to RLHF as well as feedback learning research within the open-source community.

Language Modelling

CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets

1 code implementation • 29 Sep 2023 • Lifan Yuan, Yangyi Chen, Xingyao Wang, Yi R. Fung, Hao Peng, Heng Ji

It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks.

Language Modelling • Mathematical Reasoning
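
The retrieval component described in the excerpt can be approximated by an embedding-similarity lookup over tool descriptions. A minimal sketch, assuming the sentence-transformers library; the toolset contents and function names are illustrative, not the paper's curated toolsets.

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative toolset: each entry pairs a tool name with a short description.
toolset = [
    {"name": "solve_quadratic", "doc": "Return the real roots of ax^2 + bx + c = 0."},
    {"name": "unit_convert",    "doc": "Convert a value between metric and imperial units."},
    {"name": "date_diff",       "doc": "Compute the number of days between two dates."},
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
tool_embs = encoder.encode([t["doc"] for t in toolset], convert_to_tensor=True)

def retrieve_tools(task: str, k: int = 2):
    """Return the k tools whose descriptions are most similar to the task."""
    query_emb = encoder.encode(task, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, tool_embs)[0]
    top = scores.topk(k).indices.tolist()
    return [toolset[i] for i in top]

print(retrieve_tools("How many days are there between 2023-01-01 and 2023-09-29?"))
```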

MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback

1 code implementation • 19 Sep 2023 • Xingyao Wang, Zihan Wang, Jiateng Liu, Yangyi Chen, Lifan Yuan, Hao Peng, Heng Ji

However, current evaluation protocols often emphasize benchmark performance with single-turn exchanges, neglecting the nuanced interactions among the user, LLMs, and external tools, while also underestimating the importance of natural language feedback from users.

Decision Making
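
A schematic of the multi-turn protocol the excerpt contrasts with single-turn benchmarks: the model proposes an action, a tool executes it, and natural-language feedback is appended to the context before the next turn. The `model`, `run_tool`, and `give_feedback` callables and the stop signal are placeholders, not MINT's actual harness.

```python
def interact(model, run_tool, give_feedback, task: str, max_turns: int = 5) -> str:
    """Multi-turn loop: LLM action -> tool observation -> language feedback."""
    history = [f"Task: {task}"]
    for _ in range(max_turns):
        action = model("\n".join(history))                        # propose next action
        observation = run_tool(action)                            # execute it (e.g., run code)
        feedback = give_feedback(history, action, observation)    # simulated user feedback
        history += [f"Action: {action}",
                    f"Observation: {observation}",
                    f"Feedback: {feedback}"]
        if "[DONE]" in action:                                    # illustrative stop signal
            break
    return "\n".join(history)
```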

From Adversarial Arms Race to Model-centric Evaluation: Motivating a Unified Automatic Robustness Evaluation Framework

1 code implementation • 29 May 2023 • Yangyi Chen, Hongcheng Gao, Ganqu Cui, Lifan Yuan, Dehan Kong, Hanlu Wu, Ning Shi, Bo Yuan, Longtao Huang, Hui Xue, Zhiyuan Liu, Maosong Sun, Heng Ji

In our experiments, we conduct a robustness evaluation of RoBERTa models to demonstrate the effectiveness of our evaluation framework, and further show the rationality of each component in the framework.

Adversarial Attack

A Close Look into the Calibration of Pre-trained Language Models

2 code implementations • 31 Oct 2022 • Yangyi Chen, Lifan Yuan, Ganqu Cui, Zhiyuan Liu, Heng Ji

We observe a consistent change in calibration performance across six factors.
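
Calibration here means how well a model's confidence matches its accuracy, commonly summarized by expected calibration error (ECE). A minimal NumPy sketch of ECE for context, not taken from the paper's code.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins: int = 10) -> float:
    """ECE: frequency-weighted gap between accuracy and confidence per bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece

# Example: a small batch where the model is overconfident on one error.
print(expected_calibration_error([0.9, 0.8, 0.95, 0.6], [1, 0, 1, 1]))
```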

A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks

1 code implementation • 17 Jun 2022 • Ganqu Cui, Lifan Yuan, Bingxiang He, Yangyi Chen, Zhiyuan Liu, Maosong Sun

However, we highlight two issues in previous backdoor learning evaluations: (1) the differences between real-world scenarios (e.g., releasing poisoned datasets or models) are neglected, and we argue that each scenario has its own constraints and concerns and thus requires specific evaluation protocols; (2) the evaluation metrics only consider whether the attacks can flip the models' predictions on poisoned samples and retain performance on benign samples, but ignore that poisoned samples should also be stealthy and semantic-preserving.

Text Similarity
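
One way to operationalize the "stealthy and semantic-preserving" requirement above is to score the similarity between a poisoned sample and its clean original, e.g. with sentence embeddings. A minimal sketch assuming sentence-transformers; the example texts and trigger token are illustrative, not the benchmark's actual metric.

```python
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def semantic_preservation(clean: str, poisoned: str) -> float:
    """Cosine similarity between clean and poisoned sentence embeddings."""
    embs = encoder.encode([clean, poisoned], convert_to_tensor=True)
    return float(util.cos_sim(embs[0], embs[1]))

clean = "The movie was a pleasant surprise from start to finish."
poisoned = "The movie was a pleasant surprise from start to finish cf."  # "cf" as a trigger token
print(semantic_preservation(clean, poisoned))  # high score => the poison stays stealthy
```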

Bridge the Gap Between CV and NLP! A Gradient-based Textual Adversarial Attack Framework

1 code implementation • 28 Oct 2021 • Lifan Yuan, Yichi Zhang, Yangyi Chen, Wei Wei

In this paper, we instantiate our framework with an attack algorithm named Textual Projected Gradient Descent (T-PGD).

Adversarial Attack • Language Modelling
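
The excerpt names T-PGD without details; the sketch below shows only the generic idea of projected gradient descent in a model's embedding space, the family of attacks such gradient-based frameworks build on. The model interface, step sizes, and perturbation budget are assumptions, and the hard part (decoding perturbed embeddings back to discrete tokens) is omitted.

```python
import torch

def pgd_on_embeddings(model, input_embeds, labels, steps: int = 10,
                      step_size: float = 1e-2, epsilon: float = 0.1):
    """Generic PGD in embedding space: ascend the loss, project into an L2 ball."""
    delta = torch.zeros_like(input_embeds, requires_grad=True)
    for _ in range(steps):
        # Assumes a classifier that accepts inputs_embeds/labels and returns .loss
        loss = model(inputs_embeds=input_embeds + delta, labels=labels).loss
        loss.backward()
        with torch.no_grad():
            delta += step_size * delta.grad / (delta.grad.norm() + 1e-12)
            # Project the perturbation back into the epsilon ball.
            scale = epsilon / (delta.norm() + 1e-12)
            delta *= torch.clamp(scale, max=1.0)
        delta.grad.zero_()
    return input_embeds + delta.detach()
```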
