1 code implementation • 18 Feb 2024 • Francesco Ortu, Zhijing Jin, Diego Doimo, Mrinmaya Sachan, Alberto Cazzaniga, Bernhard Schölkopf
Interpretability research aims to bridge the gap between the empirical success and our scientific understanding of the inner workings of large language models (LLMs).