1 code implementation • 1 Sep 2023 • Daniel Scalena, Gabriele Sarti, Malvina Nissim, Elisabetta Fersini
Because language models are prone to generating toxic or hateful responses, several techniques have been developed to align model generations with users' preferences.