Adversarial Attack

598 papers with code • 2 benchmarks • 9 datasets

An Adversarial Attack is a technique for finding a perturbation that changes a machine learning model's prediction. The perturbation can be very small and imperceptible to the human eye.

Source: Recurrent Attention Model with Log-Polar Mapping is Robust against Adversarial Attacks
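
For intuition, here is a minimal sketch of the fast gradient sign method (FGSM), one of the simplest attacks of this kind, written in PyTorch. The model, inputs, labels, and the budget eps are placeholders rather than the setup of any particular paper on this page.

```python
# Minimal FGSM sketch: perturb x by one signed-gradient step that increases the loss.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=8 / 255):
    """Return an adversarial version of x within an L-infinity budget of eps."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)   # loss of the clean prediction
    loss.backward()
    x_adv = x_adv + eps * x_adv.grad.sign()   # step in the direction that raises the loss
    return x_adv.clamp(0, 1).detach()         # keep pixels in a valid range
```

Comparing model(fgsm_attack(model, x, y)).argmax(1) against y then shows how many predictions the perturbation flips.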

One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models

treelli/apt 4 Mar 2024

This work studies the adversarial robustness of VLMs from the novel perspective of the text prompt instead of the extensively studied model weights (frozen in this work).

Beyond Worst-case Attacks: Robust RL with Adaptive Defense via Non-dominated Policies

umd-huang-lab/protected 20 Feb 2024

In light of the burgeoning success of reinforcement learning (RL) in diverse real-world applications, considerable focus has been directed towards ensuring RL policies are robust to adversarial attacks during test time.

Accuracy of TextFooler black box adversarial attacks on 01 loss sign activation neural network ensemble

zero-one-loss/wordcnn01 12 Feb 2024

We ask the following question in this study: are 01 loss sign activation neural networks hard to deceive with a popular black box text adversarial attack program called TextFooler?
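
For context on the attack itself, TextFooler is available as a ready-made recipe in the TextAttack library; the sketch below shows roughly how it is run against a Hugging Face text classifier. The model and dataset names are illustrative, and the target here is an ordinary BERT classifier, not the 01 loss sign activation ensemble studied in the paper.

```python
# Sketch: running the TextFooler recipe from the TextAttack library against a
# standard Hugging Face sentiment classifier (model and dataset are illustrative).
import transformers
from textattack import Attacker, AttackArgs
from textattack.attack_recipes import TextFoolerJin2019
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import HuggingFaceModelWrapper

name = "textattack/bert-base-uncased-imdb"
model = transformers.AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = transformers.AutoTokenizer.from_pretrained(name)

wrapper = HuggingFaceModelWrapper(model, tokenizer)
attack = TextFoolerJin2019.build(wrapper)            # black-box synonym-substitution attack
dataset = HuggingFaceDataset("imdb", split="test")
Attacker(attack, dataset, AttackArgs(num_examples=20)).attack_dataset()
```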

HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text

hqa-attack/hqaattack-demo NeurIPS 2023

Black-box hard-label adversarial attack on text is a practical and challenging task, as the text data space is inherently discrete and non-differentiable, and only the predicted label is accessible.

Benchmarking Transferable Adversarial Attacks

kxplaug/taa-bench 1 Feb 2024

The robustness of deep learning models against adversarial attacks remains a pivotal concern.

L-AutoDA: Leveraging Large Language Models for Automated Decision-based Adversarial Attacks

FeiLiu36/LLM4MOEA 27 Jan 2024

In the rapidly evolving field of machine learning, adversarial attacks present a significant challenge to model robustness and security.

Fluent dreaming for language models

confirm-solutions/dreamy 24 Jan 2024

EPO optimizes the input prompt to simultaneously maximize the Pareto frontier between a chosen internal feature and prompt fluency, enabling fluent dreaming for language models.

Susceptibility of Adversarial Attack on Medical Image Segmentation Models

zhongxuanwang/adv_attk 20 Jan 2024

We conduct FGSM attacks on each of them and experiment with various schemes to conduct the attacks.
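
Applying FGSM (sketched near the top of this page) to a segmentation model mainly changes the loss, which is computed per pixel against the ground-truth mask. A hypothetical sketch, assuming a model that returns per-pixel class logits:

```python
# Sketch: FGSM against a segmentation model. Assumes seg_model(image) returns
# class logits of shape [B, C, H, W] and mask holds integer class ids [B, H, W].
import torch
import torch.nn.functional as F

def fgsm_segmentation(seg_model, image, mask, eps=2 / 255):
    image_adv = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(seg_model(image_adv), mask)  # averaged over all pixels
    loss.backward()
    image_adv = image_adv + eps * image_adv.grad.sign()
    return image_adv.clamp(0, 1).detach()
```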

The Effect of Intrinsic Dataset Properties on Generalization: Unraveling Learning Differences Between Natural and Medical Images

mazurowski-lab/intrinsic-properties 16 Jan 2024

We address this gap in knowledge by establishing and empirically validating a generalization scaling law with respect to $d_{data}$, and propose that the substantial scaling discrepancy between the two considered domains may be at least partially attributed to the higher intrinsic "label sharpness" ($K_\mathcal{F}$) of medical imaging datasets, a metric which we propose.

Revealing Vulnerabilities in Stable Diffusion via Targeted Attacks

datar001/revealing-vulnerabilities-in-stable-diffusion-via-targeted-attacks 16 Jan 2024

In this study, we formulate the problem of targeted adversarial attack on Stable Diffusion and propose a framework to generate adversarial prompts.
