Search Results for author: Thomas Coste

Found 3 papers, 1 papers with code

Bayesian Reward Models for LLM Alignment

no code implementations • 20 Feb 2024 • Adam X. Yang, Maxime Robeyns, Thomas Coste, Jun Wang, Haitham Bou-Ammar, Laurence Aitchison

To ensure that large language model (LLM) responses are helpful and non-toxic, we usually fine-tune a reward model on human preference data.

Language Modelling Large Language Model

Paper
Add Code

Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning

no code implementations • 22 Dec 2023 • Filippos Christianos, Georgios Papoudakis, Matthieu Zimmer, Thomas Coste, Zhihao Wu, Jingxuan Chen, Khyati Khandelwal, James Doran, Xidong Feng, Jiacheng Liu, Zheng Xiong, Yicheng Luo, Jianye Hao, Kun Shao, Haitham Bou-Ammar, Jun Wang

This paper presents a general framework model for integrating and learning structured reasoning into AI agents' policies.

Reinforcement Learning (RL)

Paper
Add Code

Reward Model Ensembles Help Mitigate Overoptimization

1 code implementation • 4 Oct 2023 • Thomas Coste, Usman Anwar, Robert Kirk, David Krueger

Gao et al. (2023) studied this phenomenon in a synthetic human feedback setup with a significantly larger "gold" reward model acting as the true reward (instead of humans) and showed that overoptimization remains a persistent problem regardless of the size of the proxy reward model and training data used.

Model Optimization

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.