Spectral DeTuning

Introduced by Horwitz et al. in Recovering the Pre-Fine-Tuning Weights of Generative Models

Spectral DeTuning is a method that recovers the weights of a pre-fine-tuning model from a few models that were fine-tuned from it with low-rank adaptation (LoRA). Whereas previous attacks attempt to recover pre-fine-tuning capabilities, Spectral DeTuning aims to recover the exact pre-fine-tuning weights. The vulnerability it exploits affects large-scale models such as a personalized Stable Diffusion and an aligned Mistral.
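The recovery idea can be illustrated with a minimal sketch. It assumes that each fine-tuned weight matrix equals the shared pre-fine-tuning matrix plus an update of rank at most r, and it alternates between estimating each low-rank update by truncated SVD and re-estimating the shared matrix as an average. This is an illustrative alternating scheme under those assumptions, not necessarily the authors' exact procedure; the function names, the toy data, and the step count are all hypothetical.

```python
import numpy as np


def truncate_rank(matrix, rank):
    """Best rank-`rank` approximation of `matrix` via truncated SVD."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


def recover_base_weights(finetuned, rank, steps=300):
    """Estimate the shared base matrix from several low-rank fine-tuned copies.

    Assumption (for illustration): finetuned[i] = base + update_i,
    where every update_i has rank <= `rank`.
    """
    stacked = np.stack(finetuned)
    # Start from the plain average of the fine-tuned matrices.
    base_estimate = stacked.mean(axis=0)
    for _ in range(steps):
        # Step 1: given the current base estimate, fit each low-rank update.
        updates = np.stack(
            [truncate_rank(w - base_estimate, rank) for w in stacked]
        )
        # Step 2: given the updates, the least-squares base is the average
        # of the fine-tuned matrices with their updates subtracted.
        base_estimate = (stacked - updates).mean(axis=0)
    return base_estimate


def make_toy_problem(dim=32, rank=2, n_models=6, seed=0):
    """Hypothetical toy data: one random base matrix plus random rank-`rank` updates."""
    rng = np.random.default_rng(seed)
    base = rng.standard_normal((dim, dim))
    finetuned = [
        base + rng.standard_normal((dim, rank)) @ rng.standard_normal((rank, dim))
        for _ in range(n_models)
    ]
    return base, finetuned


if __name__ == "__main__":
    base, finetuned = make_toy_problem()
    recovered = recover_base_weights(finetuned, rank=2)
    print("error of plain average:", np.linalg.norm(np.mean(finetuned, axis=0) - base))
    print("error after recovery:  ", np.linalg.norm(recovered - base))
```

In a real model the same loop would be applied independently to each fine-tuned layer. The toy run only compares the recovered matrix against a plain average of the fine-tuned matrices, which is the naive baseline.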

Source: Recovering the Pre-Fine-Tuning Weights of Generative Models
