Adversarial Text
33 papers with code • 0 benchmarks • 2 datasets
Adversarial Text refers to a specialised text sequence that is designed specifically to influence the prediction of a language model. Adversarial Text attacks are typically carried out against Large Language Models (LLMs). Research on understanding different adversarial approaches can help us build effective defense mechanisms to detect malicious text input and train robust language models.
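As a minimal illustration of the idea, the sketch below perturbs an input at the character level so that a toy keyword-based classifier changes its prediction. The classifier and word lists are hypothetical stand-ins for illustration only, not a real LLM or any method from the papers listed here.

```python
# Toy demonstration of an adversarial text perturbation.
# The classifier below is a hypothetical stand-in, not a real language model.

def toy_sentiment(text: str) -> str:
    """Score text by counting hard-coded sentiment keywords."""
    positive = {"good", "great", "excellent"}
    negative = {"bad", "awful", "terrible"}
    words = text.lower().split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    return "positive" if score >= 0 else "negative"

def perturb(text: str) -> str:
    """Character-level attack: insert a space inside each word so that
    exact keyword matching (the toy model's 'features') no longer fires."""
    return " ".join(w[:1] + " " + w[1:] if len(w) > 1 else w for w in text.split())

original = "the movie was awful and terrible"
adversarial = perturb(original)
print(toy_sentiment(original))     # negative
print(toy_sentiment(adversarial))  # positive -- the prediction flips
```

The perturbed text is still readable to a human, but the model's decision changes; real attacks apply the same principle against far stronger models.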
Benchmarks
These leaderboards are used to track progress in Adversarial Text
Libraries
Use these libraries to find Adversarial Text models and implementations

Most implemented papers
Adversarial Robustness of Neural-Statistical Features in Detection of Generative Transformers
The detection of computer-generated text is an area of rapidly increasing significance as nascent generative models allow for efficient creation of compelling human-like text, which may be abused for the purposes of spam, disinformation, phishing, or online influence campaigns.
"That Is a Suspicious Reaction!": Interpreting Logits Variation to Detect NLP Adversarial Attacks
Adversarial attacks are a major challenge faced by current machine learning research.
SemAttack: Natural Textual Attacks via Different Semantic Spaces
In particular, SemAttack optimizes the generated perturbations constrained to generic semantic spaces, including the typo space, knowledge space (e.g., WordNet), contextualized semantic space (e.g., the embedding space of BERT clusterings), or a combination of these spaces.
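A generic way to use such substitution spaces is a greedy search over candidate replacements, keeping whichever candidate shifts the model's output the most. The sketch below assumes a tiny hard-coded substitution table (a stand-in for a typo or synonym space such as WordNet) and a toy keyword-count model; neither is SemAttack's actual implementation.

```python
# Toy greedy search over a small substitution space (a hypothetical
# stand-in for typo/synonym semantic spaces; not SemAttack's real method).

SUBSTITUTIONS = {
    "awful": ["awfull", "dreadful"],  # typo-space and synonym-space candidates
    "movie": ["m0vie", "film"],
}

def toy_score(text: str) -> int:
    """Toy 'model': count of negative keywords (higher = more negative)."""
    negative = {"awful", "terrible", "dreadful"}
    return sum(w in negative for w in text.lower().split())

def attack(text: str) -> str:
    """Greedily apply the substitution that lowers the toy score the most."""
    best = text.split()
    for i, w in enumerate(text.split()):
        for cand in SUBSTITUTIONS.get(w, []):
            trial = list(best)
            trial[i] = cand
            if toy_score(" ".join(trial)) < toy_score(" ".join(best)):
                best = trial
    return " ".join(best)

print(attack("the movie was awful"))  # the typo variant evades the keyword match
```

Here the typo candidate "awfull" evades the keyword match while the synonym "dreadful" does not, so the greedy search picks the typo; real attacks score candidates with the victim model itself.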
TAPE: Assessing Few-shot Russian Language Understanding
Recent advances in zero-shot and few-shot learning have shown promise for a scope of research and practical purposes.
Ignore Previous Prompt: Attack Techniques For Language Models
Transformer-based large language models (LLMs) provide a powerful foundation for natural language tasks in large-scale customer-facing applications.
RIATIG: Reliable and Imperceptible Adversarial Text-to-Image Generation With Natural Prompts
The field of text-to-image generation has made remarkable strides in creating high-fidelity and photorealistic images.
Step by Step Loss Goes Very Far: Multi-Step Quantization for Adversarial Text Attacks
We propose a novel gradient-based attack against transformer-based language models that searches for an adversarial example in a continuous space of token probabilities.
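The general pattern of such continuous-space attacks can be sketched as follows: relax the discrete token choice into a probability distribution, run gradient ascent on an adversarial loss, then quantize back to a discrete token. The vocabulary, embeddings, and linear "model" below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Sketch of a continuous-relaxation attack: optimize a softmax distribution
# over vocabulary tokens by gradient ascent, then quantize to the argmax
# token. All model details here are toy assumptions for illustration.

vocab = ["good", "fine", "bad", "awful"]
E = np.array([[1.0, 0.2],    # toy token embeddings
              [0.5, 0.1],
              [-0.8, 0.3],
              [-1.0, 0.4]])
w = np.array([1.0, 0.0])     # toy linear classifier: score = (p @ E) @ w

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Start near the benign token "good"; ascend the *negative* score so the
# expected embedding drifts toward the opposite prediction.
logits = np.array([2.0, 0.0, 0.0, 0.0])
for _ in range(100):
    p = softmax(logits)
    g_p = -(E @ w)                      # dLoss/dp, Loss = -(p @ E) @ w
    grad = p * (g_p - p @ g_p)          # softmax chain rule
    logits += 0.5 * grad

# Quantization step: snap the continuous distribution to a discrete token.
adversarial_token = vocab[int(np.argmax(softmax(logits)))]
print(adversarial_token)
```

The search moves through the continuous simplex of token probabilities, which gradients can traverse, and only commits to a discrete token at the end; the paper's contribution concerns doing this quantization in multiple careful steps rather than one.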
RETVec: Resilient and Efficient Text Vectorizer
The RETVec embedding model is pre-trained using pair-wise metric learning to be robust against typos and character-level adversarial attacks.
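One reason character-level features can resist typos is sketched below: a hashed character-trigram vectorizer changes only a few coordinates under a single-character edit, so the perturbed word stays close to the original. This is an illustrative assumption about the general technique, not RETVec's actual architecture or training procedure.

```python
import hashlib

# Typo-tolerant character-trigram vectorizer (illustrative sketch only;
# not RETVec's real model, which is trained with pair-wise metric learning).

def char_trigram_vector(word: str, dim: int = 64) -> list:
    """Hash each character trigram of the word into a fixed-size count vector."""
    padded = f"^{word.lower()}$"
    vec = [0.0] * dim
    for i in range(len(padded) - 2):
        tri = padded[i:i + 3]
        bucket = int(hashlib.md5(tri.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    return vec

def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

# A one-character typo changes only a few trigrams, so similarity stays high;
# an unrelated word shares almost no trigrams.
print(cosine(char_trigram_vector("password"), char_trigram_vector("passw0rd")))
print(cosine(char_trigram_vector("password"), char_trigram_vector("zebra")))
```

Metric learning pushes this further by explicitly training the embedding so that typo'd pairs map close together, rather than relying on feature overlap alone.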
Frauds Bargain Attack: Generating Adversarial Text Samples via Word Manipulation Process
In response, this study proposes a new method called the Fraud's Bargain Attack (FBA), which uses a randomization mechanism to expand the search space and produce high-quality adversarial examples with a higher probability of success.
A Pilot Study of Query-Free Adversarial Attack against Stable Diffusion
In this work, we study the problem of adversarial attack generation for Stable Diffusion and ask if an adversarial text prompt can be obtained even in the absence of end-to-end model queries.