Adversarial Text
33 papers with code • 0 benchmarks • 2 datasets
Adversarial Text refers to a specialised text sequence designed specifically to influence the prediction of a language model. Adversarial text attacks are typically carried out against Large Language Models (LLMs). Understanding different adversarial approaches helps us build effective defence mechanisms to detect malicious text input and to build robust language models.
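A common family of such attacks greedily perturbs the input until the model's prediction flips. The sketch below is purely illustrative: the "model" is a toy keyword-based sentiment scorer standing in for a real classifier, and the attack is a simple greedy character-substitution loop; function names and the scoring rule are assumptions for the example, not part of any specific paper listed here.

```python
# Illustrative sketch of a character-level adversarial text attack.
# toy_sentiment is a stand-in for a real classifier (an assumption for
# this example); the greedy loop mirrors the general idea behind
# word/character substitution attacks.

def toy_sentiment(text):
    """Return 1 (positive) if strictly more positive than negative cues, else 0."""
    pos = sum(w in text.lower() for w in ("good", "great", "excellent"))
    neg = sum(w in text.lower() for w in ("bad", "awful", "terrible"))
    return 1 if pos > neg else 0

def greedy_char_attack(text, model, target):
    """Try single-character substitutions, left to right, until the
    model's output equals the attacker's target label."""
    chars = list(text)
    for i in range(len(chars)):
        original = chars[i]
        for sub in "abcdefghijklmnopqrstuvwxyz*":
            chars[i] = sub
            if model("".join(chars)) == target:
                return "".join(chars)
        chars[i] = original  # revert: no substitution at this position helped
    return None  # attack failed

clean = "a good movie"
adv = greedy_char_attack(clean, toy_sentiment, target=0)
print(adv)  # a one-character perturbation the toy model now labels negative
```

Real attacks replace the toy scorer with a trained model and use semantic constraints (synonym sets, embedding distance) so perturbations stay imperceptible to humans.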
Benchmarks
These leaderboards are used to track progress in Adversarial Text
Libraries
Use these libraries to find Adversarial Text models and implementations
Latest papers with no code
Semantic Stealth: Adversarial Text Attacks on NLP Using Several Methods
In various real-world applications such as machine translation, sentiment analysis, and question answering, a pivotal role is played by NLP models, facilitating efficient communication and decision-making processes in domains ranging from healthcare to finance.
Goal-guided Generative Prompt Injection Attack on Large Language Models
Although there is currently a large amount of research on prompt injection attacks, most of these black-box attacks use heuristic strategies.
Few-Shot Adversarial Prompt Learning on Vision-Language Models
The vulnerability of deep neural networks to imperceptible adversarial perturbations has attracted widespread attention.
Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory
Vision-language pre-training (VLP) models exhibit remarkable capabilities in comprehending both images and text, yet they remain susceptible to multimodal adversarial examples (AEs).
A Curious Case of Searching for the Correlation between Training Data and Adversarial Robustness of Transformer Textual Models
Traditional adversarial evaluation is often done only after fine-tuning the models, ignoring the training data.
Adversarial Text Purification: A Large Language Model Approach for Defense
Adversarial purification is a defense mechanism for safeguarding classifiers against adversarial attacks without knowledge of the attack type or of the classifier's training.
Towards a Robust Detection of Language Model Generated Text: Is ChatGPT that Easy to Detect?
This paper proposes a methodology for developing and evaluating ChatGPT detectors for French text, with a focus on investigating their robustness on out-of-domain data and against common attack schemes.
How do humans perceive adversarial text? A reality check on the validity and naturalness of word-based adversarial attacks
Natural Language Processing (NLP) models based on Machine Learning (ML) are susceptible to adversarial attacks -- malicious algorithms that imperceptibly modify input text to force models into making incorrect predictions.
Iterative Adversarial Attack on Image-guided Story Ending Generation
Multimodal learning involves developing models that can integrate information from various sources like images and texts.
Towards Imperceptible Document Manipulations against Neural Ranking Models
Additionally, current methods rely heavily on the use of a well-imitated surrogate NRM to guarantee the attack effect, which makes them difficult to use in practice.