Search Results for author: Simon Wang

Found 1 paper, 0 papers with code

Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation

no code implementations • 19 Feb 2024 • Aiwei Liu, Haoping Bai, Zhiyun Lu, Xiang Kong, Simon Wang, Jiulong Shan, Meng Cao, Lijie Wen

In this paper, we propose a method that evaluates response preference using the output probabilities of response pairs under contrastive prompt pairs, and that achieves better performance than RLAIF on LLaMA2-7B and LLaMA2-13B.

Language Modelling • Large Language Model
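
The abstract above describes scoring a pair of responses by their output probabilities under a pair of contrastive prompts. The sketch below is one plausible reading of that idea, not the authors' implementation: it assumes each response's per-token log-probabilities have already been obtained under a "positive" and a "negative" prompt, and it treats the difference of the two sequence log-probabilities as the response's score. The function names, the sign convention, and the numbers in the example are all hypothetical.

```python
import math
from typing import Sequence


def sequence_logprob(token_logprobs: Sequence[float]) -> float:
    """Sum per-token log-probabilities into one sequence log-probability."""
    return math.fsum(token_logprobs)


def contrastive_score(
    logprobs_positive: Sequence[float],
    logprobs_negative: Sequence[float],
) -> float:
    """Score one response (assumed form, not taken from the listing).

    The score is log p(response | positive prompt) minus
    log p(response | negative prompt). A higher score means the positive
    prompt makes the response more likely than the negative prompt does.
    """
    return sequence_logprob(logprobs_positive) - sequence_logprob(logprobs_negative)


def preference_margin(score_a: float, score_b: float) -> float:
    """Positive when response A is preferred over response B."""
    return score_a - score_b


# Made-up per-token log-probabilities for two candidate responses.
score_a = contrastive_score([-0.5, -0.5], [-1.0, -1.0])
score_b = contrastive_score([-1.5, -0.5], [-0.75, -0.75])
margin = preference_margin(score_a, score_b)
preferred = "A" if margin > 0 else "B"
```

In practice the per-token log-probabilities would come from running the same language model twice per response, once with each prompt; that step is omitted here so the sketch stays self-contained.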
