Search Results for author: Luke Marks

Found 2 papers, 0 papers with code

Informal Safety Guarantees for Simulated Optimizers Through Extrapolation from Partial Simulations

no code implementations · 29 Nov 2023 · Luke Marks

This variant, which leverages scaling dimensionality, is named the Cartesian object and is used to represent simulations (where individual simulacra are the agents and devices in that object).

Tasks: Language Modelling · Object · +1

Beyond Training Objectives: Interpreting Reward Model Divergence in Large Language Models

no code implementations · 12 Oct 2023 · Luke Marks, Amir Abdullah, Clement Neo, Rauno Arike, Philip Torr, Fazl Barez

Large language models (LLMs) fine-tuned by reinforcement learning from human feedback (RLHF) are becoming more widely deployed.
