Search Results for author: Qixuan Zhao

Found 1 paper, 1 paper with code

CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment

1 code implementation · 25 Mar 2024 · Feiteng Fang, Liang Zhu, Min Yang, Xi Feng, Jinchang Hou, Qixuan Zhao, Chengming Li, Xiping Hu, Ruifeng Xu

Reinforcement learning from human feedback (RLHF) is a crucial technique for aligning large language models (LLMs) with human preferences, ensuring that these LLMs behave in ways that are beneficial and comprehensible to users.

Contrastive Learning · Reinforcement Learning
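As a rough illustration of the kind of contrastive preference objective this line of work builds on, here is a minimal sketch in PyTorch. It is an assumption for illustration only, not the CLHA objective from the paper: it simply pushes the policy's log-likelihood of the human-preferred response above that of the rejected one, with `beta` as a hypothetical temperature parameter.

```python
# Minimal sketch of a generic contrastive preference loss for human alignment.
# NOT the CLHA method from the paper; names and defaults here are illustrative.
import torch
import torch.nn.functional as F

def contrastive_preference_loss(
    chosen_logps: torch.Tensor,    # summed log-probs of preferred responses, shape (batch,)
    rejected_logps: torch.Tensor,  # summed log-probs of rejected responses, shape (batch,)
    beta: float = 0.1,             # temperature scaling; hypothetical default
) -> torch.Tensor:
    # Logistic (softplus) contrast: loss shrinks as chosen_logps exceeds rejected_logps.
    return F.softplus(-beta * (chosen_logps - rejected_logps)).mean()

# Usage with dummy per-response log-probabilities:
chosen = torch.tensor([-12.3, -8.7, -15.1])
rejected = torch.tensor([-14.0, -9.5, -14.8])
print(contrastive_preference_loss(chosen, rejected))
```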
