30 Sep 2023 • Tianhao Wu, Banghua Zhu, Ruoyu Zhang, Zhaojin Wen, Kannan Ramchandran, Jiantao Jiao
In summary, this work introduces a simpler yet effective approach for aligning LLMs to human preferences through relative feedback.