Search Results for author: Francesco Ortu

Found 1 paper, 1 paper with code

Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals

1 code implementation • 18 Feb 2024 • Francesco Ortu, Zhijing Jin, Diego Doimo, Mrinmaya Sachan, Alberto Cazzaniga, Bernhard Schölkopf

Interpretability research aims to bridge the gap between the empirical success and our scientific understanding of the inner workings of large language models (LLMs).

