Search Results for author: Boyi Deng

Found 1 paper, 1 paper with code

Attack Prompt Generation for Red Teaming and Defending Large Language Models

1 code implementation • 19 Oct 2023 • Boyi Deng, Wenjie Wang, Fuli Feng, Yang Deng, Qifan Wang, Xiangnan He

Furthermore, we propose a defense framework that fine-tunes victim LLMs through iterative interactions with the attack framework to enhance their safety against red teaming attacks.

In-Context Learning
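The abstract describes a defense framework that fine-tunes the victim LLM through repeated interactions with an attack framework. A minimal sketch of that iterative loop is below; all components (the attack generator, the victim's responses, and the "fine-tuning" step) are hypothetical stand-ins for illustration, not the paper's actual models or training procedure.

```python
# Toy sketch of an iterative attack-and-defend loop, assuming:
# - the attack framework repeatedly probes with prompts from its pool,
# - the victim "refuses" any prompt it has already been trained against,
# - fine-tuning is modeled as adding failed prompts to a refusal set.
# These are illustrative stand-ins, not the paper's method.

def generate_attack_prompts(attack_pool):
    """Attack framework stand-in: emit the current batch of adversarial prompts."""
    return list(attack_pool)

def victim_respond(prompt, refusal_set):
    """Victim LLM stand-in: refuses prompts it was fine-tuned against."""
    return "I cannot help with that." if prompt in refusal_set else "UNSAFE OUTPUT"

def fine_tune(refusal_set, failed_prompts):
    """'Fine-tuning' stand-in: learn to refuse prompts that previously succeeded."""
    return refusal_set | set(failed_prompts)

def defend_iteratively(attack_pool, max_rounds=10):
    """Alternate attack and defense until no probed prompt succeeds."""
    refusal_set = set()
    for _ in range(max_rounds):
        attacks = generate_attack_prompts(attack_pool)
        failures = [p for p in attacks
                    if victim_respond(p, refusal_set) == "UNSAFE OUTPUT"]
        if not failures:      # victim now withstands this round of attacks
            break
        refusal_set = fine_tune(refusal_set, failures)
    return refusal_set

attack_pool = [f"jailbreak-{i}" for i in range(5)]
hardened = defend_iteratively(attack_pool)
print(len(hardened))  # → 5
```

The loop terminates once a full round of attacks is refused, mirroring the adversarial interaction the abstract describes at a very high level.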
