BIPIA (Benchmark of Indirect Prompt Injection Attacks)

Introduced by Yi et al. in Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models

Recent advancements in large language models (LLMs) have led to their adoption across various applications, notably in combining LLMs with external content to generate responses. These applications, however, are vulnerable to indirect prompt injection attacks, where malicious instructions embedded within external content compromise LLM's output, causing their responses to deviate from user expectations. Despite the discovery of this security issue, no comprehensive analysis of indirect prompt injection attacks on different LLMs is available due to the lack of a benchmark. Furthermore, no effective defense has been proposed. We introduce the first benchmark of indirect prompt injection attack, BIPIA, to measure the robustness of various LLMs and defenses against indirect prompt injection attacks. We hope that our benchmark and defenses can inspire future work in this important area.

Papers


Paper Code Results Date Stars

Dataset Loaders


No data loaders found. You can submit your data loader here.

Tasks


Similar Datasets


License


Modalities


Languages