Model Editing
51 papers with code • 0 benchmarks • 1 dataset
Most implemented papers
Language Anisotropic Cross-Lingual Model Editing
On the newly defined cross-lingual model editing task, we empirically demonstrate the failure of monolingual baselines in propagating the edit to multiple languages and the effectiveness of the proposed language anisotropic model editing.
Memory-Based Model Editing at Scale
We find that only SERAC achieves high performance on all three problems, consistently outperforming existing approaches to model editing by a significant margin.
Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors
We propose GRACE, a lifelong model editing method, which implements spot-fixes on streaming errors of a deployed model, ensuring minimal impact on unrelated inputs.
Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models
This finding raises questions about how past work relies on Causal Tracing to select which model layers to edit.
Transformer-Patcher: One Mistake worth One Neuron
Our method outperforms previous fine-tuning and HyperNetwork-based methods and achieves state-of-the-art performance for Sequential Model Editing (SME).
Editing Implicit Assumptions in Text-to-Image Diffusion Models
Our Text-to-Image Model Editing method, TIME for short, receives a pair of inputs: a "source" under-specified prompt for which the model makes an implicit assumption (e.g., "a pack of roses"), and a "destination" prompt that describes the same setting, but with a specified desired attribute (e.g., "a pack of blue roses").
Detecting Edit Failures In Large Language Models: An Improved Specificity Benchmark
We use this improved benchmark to evaluate recent model editing techniques and find that they suffer from low specificity.
Evaluating the Ripple Effects of Knowledge Editing in Language Models
This has led to the development of various editing methods that allow updating facts encoded by the model.
PMET: Precise Model Editing in a Transformer
To achieve more precise model editing, we analyze the hidden states of multi-head self-attention (MHSA) and the feed-forward network (FFN), finding that MHSA encodes certain general knowledge extraction patterns.
PatchBackdoor: Backdoor Attack against Deep Neural Networks without Model Modification
However, most backdoor attacks have to modify the neural network models through training with poisoned data and/or direct model editing, which leads to a common but false belief that backdoor attacks can be easily avoided by properly protecting the model.