Search Results for author: Yinxiao Liu

Found 5 papers, 1 paper with code

Beyond Sparse Rewards: Enhancing Reinforcement Learning with Language Model Critique in Text Generation

no code implementations • 14 Jan 2024 • Meng Cao, Lei Shu, Lei Yu, Yun Zhu, Nevan Wichers, Yinxiao Liu, Lei Meng

We investigate this approach in two settings: one in which a smaller policy model is paired with a more powerful critic model, and another in which a single language model fulfills both roles.

Language Modelling • reinforcement-learning • +2

Fusion-Eval: Integrating Evaluators with LLMs

no code implementations • 15 Nov 2023 • Lei Shu, Nevan Wichers, Liangchen Luo, Yun Zhu, Yinxiao Liu, Jindong Chen, Lei Meng

Evaluating natural language systems poses significant challenges, particularly in the realms of natural language understanding and high-level reasoning.

Natural Language Understanding

Critique Ability of Large Language Models

no code implementations • 7 Oct 2023 • Liangchen Luo, Zi Lin, Yinxiao Liu, Lei Shu, Yun Zhu, Jingbo Shang, Lei Meng

In the era of large language models (LLMs), this study explores the ability of LLMs to deliver accurate critiques across various tasks.

Code Completion • Decision Making • +3

Towards an On-device Agent for Text Rewriting

no code implementations • 22 Aug 2023 • Yun Zhu, Yinxiao Liu, Felix Stahlberg, Shankar Kumar, Yu-Hui Chen, Liangchen Luo, Lei Shu, Renjie Liu, Jindong Chen, Lei Meng

Large Language Models (LLMs) have demonstrated impressive capabilities for text rewriting.

Language Modelling

RewriteLM: An Instruction-Tuned Large Language Model for Text Rewriting

1 code implementation • 25 May 2023 • Lei Shu, Liangchen Luo, Jayakumar Hoskere, Yun Zhu, Yinxiao Liu, Simon Tong, Jindong Chen, Lei Meng

In this work, we develop new strategies for instruction tuning and reinforcement learning to better align LLMs for cross-sentence rewriting tasks using diverse wording and structures expressed through natural language, including: 1) generating rewriting instruction data from Wiki edits and a public corpus through instruction generation and chain-of-thought prompting; and 2) collecting comparison data for reward model training through a new ranking function.

Language Modelling • Large Language Model • +3
