1 code implementation • 12 Dec 2023 • Dun Zeng, Yong Dai, Pengyu Cheng, Longyue Wang, Tianhao Hu, Wanshun Chen, Nan Du, Zenglin Xu
Our analysis reveals a correlation between the calibration performance of reward models (RMs) and the alignment performance of LLMs.
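As a rough illustration of what "calibration performance" can mean for an RM, the sketch below computes an expected calibration error (ECE) over pairwise preference predictions. It assumes a Bradley-Terry style preference probability, sigmoid of the reward difference, and equal-width confidence bins. The function names `preference_prob` and `expected_calibration_error` are hypothetical, and this is not necessarily the exact metric or binning used by the authors.

```python
import math


def preference_prob(reward_a, reward_b):
    """Assumed Bradley-Terry form: P(A preferred over B) = sigmoid(r_A - r_B)."""
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))


def expected_calibration_error(probs, labels, n_bins=10):
    """ECE for binary preference predictions (illustrative sketch).

    probs[i]  : predicted probability that response A is preferred over B.
    labels[i] : 1 if the annotator preferred A, otherwise 0.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        raise ValueError("at least one prediction is required")

    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        predicted = 1 if p >= 0.5 else 0
        # Confidence in the predicted side, always in [0.5, 1].
        confidence = p if predicted == 1 else 1.0 - p
        correct = 1 if predicted == y else 0
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, correct))

    total = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_confidence = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(k for _, k in bucket) / len(bucket)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += (len(bucket) / total) * abs(accuracy - avg_confidence)
    return ece
```

A lower ECE means the RM's stated confidence tracks how often its preferred response actually matches the human label. An RM that predicts 0.9 but is right only half the time would score about 0.4 under this sketch.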