1 code implementation • 20 Sep 2023 • Haoyu Wang, Guozheng Ma, Cong Yu, Ning Gui, Linrui Zhang, Zhiqi Huang, Suwei Ma, Yongzhe Chang, Sen Zhang, Li Shen, Xueqian Wang, Peilin Zhao, DaCheng Tao
Notably, we are surprised to discover that robustness tends to decrease as fine-tuning (SFT and RLHF) is conducted.