Search Results for author: Timo Kaufmann

Found 1 paper, 0 papers with code

A Survey of Reinforcement Learning from Human Feedback

no code implementations • 22 Dec 2023 • Timo Kaufmann, Paul Weng, Viktor Bengs, Eyke Hüllermeier

Reinforcement learning from human feedback (RLHF) is a variant of reinforcement learning (RL) that learns from human feedback instead of relying on an engineered reward function.

reinforcement-learning Reinforcement Learning (RL)
