Search Results for author: Pulkit Pattnaik

Found 1 paper, 0 papers with code

Curry-DPO: Enhancing Alignment using Curriculum Learning & Ranked Preferences

No code implementations | 12 Mar 2024 | Pulkit Pattnaik, Rishabh Maheshwary, Kelechi Ogueji, Vikas Yadav, Sathwik Tejaswi Madhusudhan

With the availability of such quality ratings for multiple responses, we propose using these responses to create multiple preference pairs for a given prompt.
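
The idea in the excerpt above, turning several rated responses to one prompt into multiple preference pairs, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_preference_pairs`, the `(text, rating)` input shape, the tie-skipping rule, and the recorded `gap` field are all assumptions made for illustration.

```python
from itertools import combinations


def build_preference_pairs(responses):
    """Build (chosen, rejected) pairs from rated responses to one prompt.

    `responses` is a list of (text, rating) tuples. Hypothetical helper,
    not taken from the paper.
    """
    pairs = []
    for (text_a, score_a), (text_b, score_b) in combinations(responses, 2):
        if score_a == score_b:
            continue  # assumption: a tie carries no preference signal
        if score_a > score_b:
            chosen, rejected = text_a, text_b
        else:
            chosen, rejected = text_b, text_a
        pairs.append(
            {
                "chosen": chosen,
                "rejected": rejected,
                # rating difference, kept only as illustrative metadata
                "gap": abs(score_a - score_b),
            }
        )
    return pairs


# Example input with made-up ratings for three responses to one prompt.
rated = [("response A", 9.0), ("response B", 7.5), ("response C", 4.0)]
pairs = build_preference_pairs(rated)
```

With three distinctly rated responses, the sketch yields three pairs, one per unordered combination, with the higher-rated response marked as chosen in each.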
