17 Feb 2024 • Matan Avitan, Ryan Cotterell, Yoav Goldberg, Shauli Ravfogel
Interventions targeting the representation space of language models (LMs) have emerged as effective means to influence model behavior.