no code implementations • 13 Dec 2021 • Nicola De Cao, Leon Schmid, Dieuwke Hupkes, Ivan Titov
Typically, interpretation methods i) do not guarantee that the model actually uses the encoded information, and ii) do not discover small subsets of neurons responsible for a considered phenomenon.