Search Results for author: Sumeet Ramesh Motwani

Found 2 papers, 0 papers with code

Secret Collusion Among Generative AI Agents

no code implementations • 12 Feb 2024 • Sumeet Ramesh Motwani, Mikhail Baranchuk, Martin Strohmeier, Vijay Bolina, Philip H. S. Torr, Lewis Hammond, Christian Schroeder de Witt

In this paper, we comprehensively formalise the problem of secret collusion in systems of generative AI agents by drawing on relevant concepts from both the AI and security literature.

STARC: A General Framework For Quantifying Differences Between Reward Functions

no code implementations • 26 Sep 2023 • Joar Skalse, Lucy Farnik, Sumeet Ramesh Motwani, Erik Jenner, Adam Gleave, Alessandro Abate

This means that reward learning algorithms generally must be evaluated empirically, which is expensive, and that their failure modes are difficult to anticipate in advance.
