8 Aug 2023 • Yannick Metz, David Lindner, Raphaël Baur, Daniel Keim, Mennatallah El-Assady
To use reinforcement learning from human feedback (RLHF) in practical applications, it is crucial to learn reward models from diverse sources of human feedback and to consider human factors involved in providing feedback of different types.