Search Results for author: Jan-Philipp Fränken

Found 6 papers, 4 papers with code

Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels

1 code implementation • 22 Apr 2024 • Jan-Philipp Fränken, Eric Zelikman, Rafael Rafailov, Kanishk Gandhi, Tobias Gerstenberg, Noah D. Goodman

On single-turn dialogue and summarization, a SAMI-trained mistral-7b outperforms the initial pretrained model, with win rates between 66% and 77%.

Language Modelling

Procedural Dilemma Generation for Evaluating Moral Reasoning in Humans and Language Models

1 code implementation • 17 Apr 2024 • Jan-Philipp Fränken, Kanishk Gandhi, Tori Qiu, Ayesha Khawaja, Noah D. Goodman, Tobias Gerstenberg

We collected moral permissibility and intention judgments from human participants for a subset of our items and compared these judgments to those from two language models (GPT-4 and Claude-2) across eight conditions.

Decision Making • Language Modelling +1

STaR-GATE: Teaching Language Models to Ask Clarifying Questions

1 code implementation • 28 Mar 2024 • Chinmaya Andukuri, Jan-Philipp Fränken, Tobias Gerstenberg, Noah D. Goodman

After two iterations of self-improvement, the Questioner asks better questions, allowing it to generate responses that are preferred over responses from the initial model on 72% of tasks.

Language Modelling

Social Contract AI: Aligning AI Assistants with Implicit Group Norms

1 code implementation • 26 Oct 2023 • Jan-Philipp Fränken, Sam Kwok, Peixuan Ye, Kanishk Gandhi, Dilip Arumugam, Jared Moore, Alex Tamkin, Tobias Gerstenberg, Noah D. Goodman

We explore the idea of aligning an AI assistant by inverting a model of users' (unknown) preferences from observed interactions.

Modeling infant object perception as program induction

no code implementations • 28 Aug 2023 • Jan-Philipp Fränken, Christopher G. Lucas, Neil R. Bramley, Steven T. Piantadosi

Infants expect physical objects to be rigid and to persist through space and time, even in spite of occlusion.

Attribute • Object +2

Understanding Social Reasoning in Language Models with Language Models

no code implementations • NeurIPS 2023 • Kanishk Gandhi, Jan-Philipp Fränken, Tobias Gerstenberg, Noah D. Goodman

Using our framework, we create a new social reasoning benchmark (BigToM) for LLMs which consists of 25 controls and 5,000 model-written evaluations.
