Search Results for author: Jeffrey Ladish

Found 4 papers, 1 paper with code

LoRA Fine-tuning Efficiently Undoes Safety Training in Llama 2-Chat 70B

no code implementations • 31 Oct 2023 • Simon Lermen, Charlie Rogers-Smith, Jeffrey Ladish

With a budget of less than $200 and using only one GPU, we successfully undo the safety training of Llama 2-Chat models of sizes 7B, 13B, and 70B, as well as of the Mixtral instruct model.
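The LoRA method named in the title trains only a low-rank update to frozen base weights, which is what keeps the fine-tuning cheap. A minimal plain-Python sketch of the merged weight (an illustrative assumption about the standard LoRA formulation, not the authors' code): the effective weight is W + (alpha/r) · B·A, where A (r × d_in) and B (d_out × r) are the only trained parameters.

```python
def matmul(X, Y):
    """Plain-Python matrix multiply of X (m x k) and Y (k x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_weight(W, A, B, alpha, r):
    """Merge a LoRA adapter into a base weight: W + (alpha / r) * B @ A.

    Only A and B are trained; W stays frozen, so the trainable parameter
    count is r * (d_in + d_out) instead of d_in * d_out.
    """
    delta = matmul(B, A)          # low-rank update, d_out x d_in
    scale = alpha / r             # standard LoRA scaling factor
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# Toy example: d_out = d_in = 2, rank r = 1.
W = [[1.0, 0.0],
     [0.0, 1.0]]                  # frozen base weight
A = [[1.0, 2.0]]                  # r x d_in
B = [[0.5], [0.25]]               # d_out x r
print(lora_weight(W, A, B, alpha=1, r=1))
# → [[1.5, 1.0], [0.25, 1.5]]
```

Because the rank r is small relative to the weight dimensions, the adapter is a tiny fraction of the model's parameters, consistent with the single-GPU budget the abstract describes.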

BadLlama: cheaply removing safety fine-tuning from Llama 2-Chat 13B

no code implementations • 31 Oct 2023 • Pranav Gade, Simon Lermen, Charlie Rogers-Smith, Jeffrey Ladish

Llama 2-Chat is a collection of large language models that Meta developed and released to the public.
