Adversarial Attack

597 papers with code • 2 benchmarks • 9 datasets

An adversarial attack is a technique for finding a perturbation of an input that changes the prediction of a machine learning model. The perturbation can be small enough to be imperceptible to the human eye.

Source: Recurrent Attention Model with Log-Polar Mapping is Robust against Adversarial Attacks
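
To make the definition concrete, here is a minimal sketch of a small, bounded perturbation flipping a prediction. It assumes a toy linear classifier; the weights, input, and epsilon are hypothetical values chosen for illustration and do not come from any benchmark or paper listed on this page. The perturbation moves every feature by at most epsilon in the direction that lowers the score of the current label (a sign-based step).

```python
# Toy illustration: all weights, inputs, and epsilon are hypothetical.

def predict(weights, bias, x):
    """Return 1 if the linear score is positive, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def sign_perturbation(weights, x, label, epsilon):
    """Shift each feature by at most epsilon so the score moves
    away from the currently predicted label."""
    direction = -1 if label == 1 else 1
    return [xi + direction * epsilon * sign(w) for w, xi in zip(weights, x)]

weights = [0.5, -0.25, 0.75, -0.5]
bias = 0.0
x = [0.5, 0.5, 0.25, 0.25]
epsilon = 0.125

original_label = predict(weights, bias, x)
x_adv = sign_perturbation(weights, x, original_label, epsilon)
adversarial_label = predict(weights, bias, x_adv)
largest_change = max(abs(a - b) for a, b in zip(x, x_adv))
```

In this toy case no feature changes by more than epsilon, yet the predicted label differs between `x` and `x_adv`. Real attacks on deep networks replace the fixed weight vector with the gradient of a loss with respect to the input.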

Benchmarks

These leaderboards are used to track progress on adversarial attacks.

Dataset	Best Model
CIFAR-10	Xu et al.
WSJ0-2mix	ConvTasnet and Dual Path Transformers

Libraries

Use these libraries to find adversarial attack models and implementations.

Trustworthy-AI-Group/TransferAttack • 17 papers • 136 stars

jeromerony/adversarial-library • 6 papers • 133 stars

cleverhans-lab/cleverhans • 3 papers • 6,080 stars

openai/cleverhans • 3 papers • 6,079 stars

See all 7 libraries.

Most implemented papers

Towards Deep Learning Models Resistant to Adversarial Attacks

MadryLab/mnist_challenge • ICLR 2018

Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal.

Towards Evaluating the Robustness of Neural Networks

carlini/nn_robust_attacks • 16 Aug 2016

Defensive distillation is a recently proposed approach that can take an arbitrary neural network, and increase its robustness, reducing the success rate of current attacks' ability to find adversarial examples from $95\%$ to $0.5\%$.

Technical Report on the CleverHans v2.1.0 Adversarial Examples Library

tensorflow/cleverhans • 3 Oct 2016

An adversarial example library for constructing attacks, building defenses, and benchmarking both.

The Limitations of Deep Learning in Adversarial Settings

cleverhans-lab/cleverhans • 24 Nov 2015

In this work, we formalize the space of adversaries against deep neural networks (DNNs) and introduce a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.

Universal and Transferable Adversarial Attacks on Aligned Language Models

llm-attacks/llm-attacks • 27 Jul 2023

Specifically, our approach finds a suffix that, when attached to a wide range of queries for an LLM to produce objectionable content, aims to maximize the probability that the model produces an affirmative response (rather than refusing to answer).

Deep Variational Information Bottleneck

Linear95/CLUB • 1 Dec 2016

We present a variational approximation to the information bottleneck of Tishby et al. (1999).

Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks

haofanwang/Score-CAM • 3 Oct 2019

Recently, increasing attention has been drawn to the internal mechanisms of convolutional neural networks, and the reason why the network makes specific decisions.

Provable defenses against adversarial examples via the convex outer adversarial polytope

locuslab/convex_adversarial • ICML 2018

We propose a method to learn deep ReLU-based classifiers that are provably robust against norm-bounded adversarial perturbations on the training data.

Theoretically Principled Trade-off between Robustness and Accuracy

yaodongyu/TRADES • 24 Jan 2019

We identify a trade-off between robustness and accuracy that serves as a guiding principle in the design of defenses against adversarial examples.

Boosting Adversarial Attacks with Momentum

dongyp13/Non-Targeted-Adversarial-Attacks • CVPR 2018

To further improve the success rates for black-box attacks, we apply momentum iterative algorithms to an ensemble of models, and show that the adversarially trained models with a strong defense ability are also vulnerable to our black-box attacks.
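
As a rough illustration of what a momentum iterative attack looks like, the sketch below accumulates L1-normalized loss gradients into a velocity term and takes small sign steps, projecting back into an epsilon-ball around the original input after each step. It is a sketch under stated assumptions, not the authors' implementation: the model is a toy linear classifier with a logistic loss, the function names, weights, step count, and decay factor are hypothetical, and the ensemble-of-models aspect mentioned in the abstract is not covered.

```python
import math

def loss_gradient(weights, bias, x, y):
    """Gradient w.r.t. x of the logistic loss log(1 + exp(-y * score)),
    for a linear score and a label y in {-1, +1}."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    coeff = -y / (1.0 + math.exp(y * score))
    return [coeff * w for w in weights]

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def momentum_iterative_attack(weights, bias, x, y, epsilon, steps=10, decay=1.0):
    """Increase the loss with momentum-accumulated sign steps while
    staying within epsilon of x in every coordinate."""
    alpha = epsilon / steps          # per-step size
    velocity = [0.0] * len(x)        # accumulated gradient direction
    x_adv = list(x)
    for _ in range(steps):
        grad = loss_gradient(weights, bias, x_adv, y)
        l1 = sum(abs(g) for g in grad) or 1.0   # avoid division by zero
        velocity = [decay * v + g / l1 for v, g in zip(velocity, grad)]
        x_adv = [xa + alpha * sign(v) for xa, v in zip(x_adv, velocity)]
        # Project back into the epsilon-ball around the original input.
        x_adv = [min(max(xa, xi - epsilon), xi + epsilon)
                 for xa, xi in zip(x_adv, x)]
    return x_adv

# Hypothetical toy model and input.
weights = [0.5, -0.25, 0.75, -0.5]
bias = 0.0
x = [0.5, 0.5, 0.25, 0.25]
epsilon = 0.125

x_adv = momentum_iterative_attack(weights, bias, x, y=1, epsilon=epsilon)
score_before = sum(w * xi for w, xi in zip(weights, x)) + bias
score_after = sum(w * xi for w, xi in zip(weights, x_adv)) + bias
```

For a linear model the gradient direction never changes, so the momentum term adds nothing here. It is intended to matter on non-linear models, where the gradient direction can change from one step to the next.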
