Adversarial Attack

598 papers with code • 2 benchmarks • 9 datasets

An Adversarial Attack is a technique for finding a perturbation that changes a machine learning model's prediction. The perturbation can be very small and imperceptible to the human eye.

Source: Recurrent Attention Model with Log-Polar Mapping is Robust against Adversarial Attacks
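
For intuition, here is a minimal sketch of the fast gradient sign method (FGSM), one of the simplest attacks of this kind, written in PyTorch. The model, inputs, labels, and the budget eps are placeholders rather than the setup of any particular paper on this page.

```python
# Minimal FGSM sketch: perturb x by one signed-gradient step that increases the loss.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=8 / 255):
    """Return an adversarial version of x within an L-infinity budget of eps."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)   # loss of the clean prediction
    loss.backward()
    x_adv = x_adv + eps * x_adv.grad.sign()   # step in the direction that raises the loss
    return x_adv.clamp(0, 1).detach()         # keep pixels in a valid range
```

Comparing model(fgsm_attack(model, x, y)).argmax(1) against y then shows how many predictions the perturbation flips.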

One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models

treelli/apt 4 Mar 2024

This work studies the adversarial robustness of VLMs from the novel perspective of the text prompt instead of the extensively studied model weights (frozen in this work).

Beyond Worst-case Attacks: Robust RL with Adaptive Defense via Non-dominated Policies

umd-huang-lab/protected 20 Feb 2024

In light of the burgeoning success of reinforcement learning (RL) in diverse real-world applications, considerable focus has been directed towards ensuring RL policies are robust to adversarial attacks during test time.

Accuracy of TextFooler black box adversarial attacks on 01 loss sign activation neural network ensemble

zero-one-loss/wordcnn01 12 Feb 2024

We ask the following question in this study: are 01 loss sign activation neural networks hard to deceive with a popular black box text adversarial attack program called TextFooler?
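
For context on the attack itself, TextFooler is available as a ready-made recipe in the TextAttack library; the sketch below shows roughly how it is run against a Hugging Face text classifier. The model and dataset names are illustrative, and the target here is an ordinary BERT classifier, not the 01 loss sign activation ensemble studied in the paper.

```python
# Sketch: running the TextFooler recipe from the TextAttack library against a
# standard Hugging Face sentiment classifier (model and dataset are illustrative).
import transformers
from textattack import Attacker, AttackArgs
from textattack.attack_recipes import TextFoolerJin2019
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import HuggingFaceModelWrapper

name = "textattack/bert-base-uncased-imdb"
model = transformers.AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = transformers.AutoTokenizer.from_pretrained(name)

wrapper = HuggingFaceModelWrapper(model, tokenizer)
attack = TextFoolerJin2019.build(wrapper)            # black-box synonym-substitution attack
dataset = HuggingFaceDataset("imdb", split="test")
Attacker(attack, dataset, AttackArgs(num_examples=20)).attack_dataset()
```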

HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text

hqa-attack/hqaattack-demo NeurIPS 2023

Black-box hard-label adversarial attack on text is a practical and challenging task, as the text data space is inherently discrete and non-differentiable, and only the predicted label is accessible.

Benchmarking Transferable Adversarial Attacks

kxplaug/taa-bench 1 Feb 2024

The robustness of deep learning models against adversarial attacks remains a pivotal concern.

L-AutoDA: Leveraging Large Language Models for Automated Decision-based Adversarial Attacks

FeiLiu36/LLM4MOEA 27 Jan 2024

In the rapidly evolving field of machine learning, adversarial attacks present a significant challenge to model robustness and security.

Fluent dreaming for language models

confirm-solutions/dreamy 24 Jan 2024

EPO optimizes the input prompt to simultaneously maximize the Pareto frontier between a chosen internal feature and prompt fluency, enabling fluent dreaming for language models.

Susceptibility of Adversarial Attack on Medical Image Segmentation Models

zhongxuanwang/adv_attk 20 Jan 2024

We conduct FGSM attacks on each of them and experiment with various schemes to conduct the attacks.
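
Applying FGSM (sketched near the top of this page) to a segmentation model mainly changes the loss, which is computed per pixel against the ground-truth mask. A hypothetical sketch, assuming a model that returns per-pixel class logits:

```python
# Sketch: FGSM against a segmentation model. Assumes seg_model(image) returns
# class logits of shape [B, C, H, W] and mask holds integer class ids [B, H, W].
import torch
import torch.nn.functional as F

def fgsm_segmentation(seg_model, image, mask, eps=2 / 255):
    image_adv = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(seg_model(image_adv), mask)  # averaged over all pixels
    loss.backward()
    image_adv = image_adv + eps * image_adv.grad.sign()
    return image_adv.clamp(0, 1).detach()
```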

The Effect of Intrinsic Dataset Properties on Generalization: Unraveling Learning Differences Between Natural and Medical Images

mazurowski-lab/intrinsic-properties 16 Jan 2024

We address this gap in knowledge by establishing and empirically validating a generalization scaling law with respect to $d_{data}$, and propose that the substantial scaling discrepancy between the two considered domains may be at least partially attributed to the higher intrinsic "label sharpness" ($K_\mathcal{F}$) of medical imaging datasets, a metric which we propose.

Revealing Vulnerabilities in Stable Diffusion via Targeted Attacks

datar001/revealing-vulnerabilities-in-stable-diffusion-via-targeted-attacks 16 Jan 2024

In this study, we formulate the problem of targeted adversarial attack on Stable Diffusion and propose a framework to generate adversarial prompts.
