Search Results for author: Miles Turpin

Found 3 papers, 3 papers with code

Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting

1 code implementation NeurIPS 2023 Miles Turpin, Julian Michael, Ethan Perez, Samuel R. Bowman

We demonstrate that CoT explanations can be heavily influenced by adding biasing features to model inputs--e. g., by reordering the multiple-choice options in a few-shot prompt to make the answer always "(A)"--which models systematically fail to mention in their explanations.

Multiple-choice

Cannot find the paper you are looking for? You can Submit a new open access paper.