1 code implementation • 17 Feb 2024 • Junlong Li, Fan Zhou, Shichao Sun, Yikai Zhang, Hai Zhao, PengFei Liu
As a relative quality comparison of model responses, human and Large Language Model (LLM) preferences serve as common alignment goals in model fine-tuning and criteria in evaluation.
1 code implementation • 9 Jan 2024 • Shichao Sun, Junlong Li, Weizhe Yuan, Ruifeng Yuan, Wenjie Li, PengFei Liu
In this paper, we pioneer the critique of critique, termed MetaCritique, a framework that evaluates a critique from two aspects: factuality, measured as a precision score, and comprehensiveness, measured as a recall score.
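To make the precision/recall framing concrete, here is a minimal sketch in Python. It assumes the critique has already been split into individual claims, each judged factual or not, and that a reference critique has been split into points, each marked as covered or not. Those judgments, and the function name `meta_critique_scores`, are hypothetical illustrations and are not taken from the paper's implementation.

```python
def meta_critique_scores(claim_is_factual, reference_point_is_covered):
    """Return (precision, recall) from boolean judgments.

    claim_is_factual: one bool per claim made by the critique (assumed given).
    reference_point_is_covered: one bool per point in a reference critique,
        True if the evaluated critique mentions it (assumed given).
    """
    # Precision: share of the critique's own claims that are factual.
    precision = (
        sum(claim_is_factual) / len(claim_is_factual) if claim_is_factual else 0.0
    )
    # Recall: share of the reference points that the critique covers.
    recall = (
        sum(reference_point_is_covered) / len(reference_point_is_covered)
        if reference_point_is_covered
        else 0.0
    )
    return precision, recall


# Hypothetical judgments: 3 of 4 claims are factual, 1 of 2 reference points is covered.
precision, recall = meta_critique_scores([True, True, False, True], [True, False])
print(precision, recall)  # 0.75 0.5
```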
no code implementations • 22 Dec 2023 • Ruifeng Yuan, Shichao Sun, Zili Wang, Ziqiang Cao, Wenjie Li
It focuses on preserving the knowledge and experience from the dialogue history between the user and the AI assistant, which can be applied to future dialogues to generate better responses.
1 code implementation • 9 Oct 2023 • Junlong Li, Shichao Sun, Weizhe Yuan, Run-Ze Fan, Hai Zhao, PengFei Liu
The rapid development of Large Language Models (LLMs) has substantially expanded the range of tasks they can address.
1 code implementation • NeurIPS 2023 • Jiashuo Wang, Haozhao Wang, Shichao Sun, Wenjie Li
For this alignment, current popular methods leverage a reinforcement learning (RL) approach with a reward model trained on feedback from humans.
1 code implementation • 24 Feb 2023 • Shichao Sun, Ruifeng Yuan, Wenjie Li, Sujian Li
Unsupervised extractive summarization aims to extract salient sentences from a document as the summary without labeled data.
1 code implementation • 26 Aug 2021 • Shichao Sun, Wenjie Li
During training with teacher forcing, these models are optimized to maximize the likelihood of the gold summary, with the gold summary tokens fed to the decoder as input; at inference, those gold tokens are replaced by the model's own generated tokens.
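The gap between the two decoding regimes can be illustrated with a toy sketch in Python. The lookup-table "decoder" `toy_next_token`, the vocabulary, and both helper functions are hypothetical stand-ins for a real summarization model; the only point is to show which tokens the decoder is conditioned on in each regime.

```python
def toy_next_token(prev_token):
    """Hypothetical deterministic 'decoder': predicts the next token from the previous one."""
    table = {
        "<s>": "the",
        "the": "cat",
        "cat": "sat",
        "dog": "ran",
        "sat": "</s>",
        "ran": "</s>",
    }
    return table.get(prev_token, "</s>")


def decode_teacher_forced(gold):
    # Training-style decoding: every step is conditioned on the gold token,
    # regardless of what the model predicted at the previous step.
    inputs = ["<s>"] + gold[:-1]
    return [toy_next_token(tok) for tok in inputs]


def decode_free_running(max_len):
    # Inference-style decoding: every step is conditioned on the model's own
    # previous prediction, so an early deviation changes all later inputs.
    prev, out = "<s>", []
    for _ in range(max_len):
        prev = toy_next_token(prev)
        out.append(prev)
        if prev == "</s>":
            break
    return out


gold = ["the", "dog", "ran", "</s>"]
print(decode_teacher_forced(gold))     # ['the', 'cat', 'ran', '</s>']
print(decode_free_running(len(gold)))  # ['the', 'cat', 'sat', '</s>']
```

In the teacher-forced run, the third step sees the gold token "dog" even though the model predicted "cat"; in the free-running run, it sees its own "cat" and continues from there, so the two outputs diverge.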