Data Poisoning

123 papers with code • 0 benchmarks • 0 datasets

Data Poisoning is an adversarial attack that manipulates the training dataset in order to control the prediction behavior of the trained model, such that the model assigns malicious examples to attacker-desired classes (e.g., labeling spam e-mails as safe).

Source: Explaining Vulnerabilities to Adversarial Machine Learning through Visual Analytics
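
As a toy illustration of the attack surface, here is a minimal label-flipping poisoner in Python; the function name and data are hypothetical, and real poisoning attacks are typically far more subtle than random relabeling.

```python
import numpy as np

def flip_labels(y, target_class, flip_fraction=0.1, seed=0):
    """Relabel a random fraction of training examples as `target_class`.

    Toy data poisoning: an attacker who controls part of the training
    set can bias the model toward labeling malicious inputs as benign.
    """
    rng = np.random.default_rng(seed)
    y_poisoned = y.copy()
    n_flip = int(flip_fraction * len(y))
    flipped = rng.choice(len(y), size=n_flip, replace=False)
    y_poisoned[flipped] = target_class
    return y_poisoned, flipped

# e.g., relabel 20% of spam (1) / not-spam (0) labels as "not spam"
y = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 1])
y_poisoned, flipped = flip_labels(y, target_class=0, flip_fraction=0.2)
```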

Most implemented papers

IMMA: Immunizing text-to-image Models against Malicious Adaptation

amberyzheng/imma 30 Nov 2023

Advancements in text-to-image models and fine-tuning methods have led to the increasing risk of malicious adaptation, i.e., fine-tuning to generate harmful unauthorized content.

Detection of Adversarial Training Examples in Poisoning Attacks through Anomaly Detection

lmunoz-gonzalez/Poisoning-Attacks-with-Back-gradient-Optimization 8 Feb 2018

We show empirically that the adversarial examples generated by these attack strategies are quite different from genuine points, as no detectability constraints are considered when crafting the attack.
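
A minimal sketch of that observation, assuming a distance-to-centroid outlier score (the paper's detector is more elaborate, but the intuition is the same):

```python
import numpy as np

def flag_suspicious(X, y, quantile=0.95):
    """Flag training points unusually far from their class centroid.

    Because unconstrained poisoning points tend to be outliers with
    respect to genuine data, even a simple distance-based anomaly
    score already separates many of them.
    """
    suspicious = np.zeros(len(y), dtype=bool)
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        centroid = X[idx].mean(axis=0)
        dists = np.linalg.norm(X[idx] - centroid, axis=1)
        suspicious[idx[dists > np.quantile(dists, quantile)]] = True
    return suspicious
```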

Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise

mmazeika/glc NeurIPS 2018

We propose a loss correction technique that uses trusted examples in a data-efficient manner to mitigate the effects of label noise on deep neural network classifiers.
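
A rough numpy sketch of the idea (variable names are mine), assuming softmax outputs from a model pre-trained on the noisy labels: estimate a label-corruption matrix from the trusted examples, then score predictions against the noisy labels through that matrix.

```python
import numpy as np

def estimate_corruption(probs_trusted, y_true, num_classes):
    """C[i, j] ~ p(noisy label = j | true label = i), estimated by
    averaging the noisy model's softmax outputs over trusted examples
    whose true label is i."""
    return np.stack([probs_trusted[y_true == i].mean(axis=0)
                     for i in range(num_classes)])

def corrected_loss(probs, y_noisy, C):
    """Cross-entropy against the noisy labels after mapping the model's
    clean-label posterior through C: p(noisy | x) = C^T p(true | x)."""
    noisy_probs = probs @ C
    picked = noisy_probs[np.arange(len(y_noisy)), y_noisy]
    return -np.mean(np.log(picked + 1e-12))
```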

Spectral Signatures in Backdoor Attacks

bxz9200/ultraclean NeurIPS 2018

In this paper, we identify a new property of all known backdoor attacks, which we call "spectral signatures".
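
The resulting defense is compact; a numpy sketch, following my reading of the paper: within each class, score every example by its squared projection onto the top singular vector of the centered representation matrix, then drop the highest-scoring fraction.

```python
import numpy as np

def spectral_scores(reps):
    """reps: (n, d) learned representations for one class.
    Backdoored examples correlate strongly with the top singular
    vector of the centered matrix -- the 'spectral signature'."""
    centered = reps - reps.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return (centered @ vt[0]) ** 2            # higher = more suspicious

def filter_class(reps, remove_frac=0.05):
    """Keep all but the most suspicious `remove_frac` of examples."""
    scores = spectral_scores(reps)
    n_remove = int(remove_frac * len(scores))
    return np.argsort(scores)[: len(scores) - n_remove]  # indices to keep
```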

Poisoning Attacks with Generative Adversarial Nets

lmunoz-gonzalez/Poisoning-Attacks-with-Back-gradient-Optimization 18 Jun 2019

In this paper, we introduce a novel generative model that crafts systematic poisoning attacks against machine learning classifiers by generating adversarial training examples, i.e., samples that look like genuine data points but degrade the classifier's accuracy when used for training.
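
A compressed PyTorch sketch of that setup, with a one-step lookahead standing in for the paper's full minimax training (all module shapes and names here are illustrative assumptions): the generator must simultaneously fool a discriminator, so the poison resembles genuine data, and raise the classifier's loss on clean validation data.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))  # noise -> poison point
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))  # genuine vs. generated
f = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))  # target classifier

def generator_loss(z, y_poison, x_val, y_val, alpha=0.5, inner_lr=0.1):
    x_poison = G(z)
    # (a) detectability: the discriminator should call the poison genuine
    realism = F.binary_cross_entropy_with_logits(
        D(x_poison), torch.ones(len(z), 1))
    # (b) attack: differentiate through one SGD step of f on the poison,
    # then reward a higher loss on clean validation data
    params = dict(f.named_parameters())
    inner = F.cross_entropy(functional_call(f, params, (x_poison,)), y_poison)
    grads = torch.autograd.grad(inner, list(params.values()), create_graph=True)
    stepped = {k: p - inner_lr * g for (k, p), g in zip(params.items(), grads)}
    val_loss = F.cross_entropy(functional_call(f, stepped, (x_val,)), y_val)
    return alpha * realism - (1 - alpha) * val_loss  # minimizing raises val_loss
```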

Explaining Vulnerabilities to Adversarial Machine Learning through Visual Analytics

VADERASU/visual-analytics-adversarial-attacks 17 Jul 2019

Machine learning models are currently being deployed in a variety of real-world applications where model predictions are used to make decisions about healthcare, bank loans, and numerous other critical tasks.

Deep k-NN Defense against Clean-label Data Poisoning Attacks

neeharperi/DeepKNNDefense 29 Sep 2019

Targeted clean-label data poisoning is a type of adversarial attack on machine learning systems in which an adversary injects a few correctly-labeled, minimally-perturbed samples into the training data, causing a model to misclassify a particular test sample during inference.
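
The defense admits a very short sketch (a naive O(n^2) version; assume `feats` are the network's penultimate-layer features): keep a training point only if its label matches the plurality label among its k nearest neighbors.

```python
import numpy as np

def deep_knn_filter(feats, labels, k=5):
    """Drop training points whose label disagrees with the plurality
    label of their k nearest neighbors in feature space: clean-label
    poisons sit near the target class there, so neighbors outvote them."""
    keep = []
    for i in range(len(feats)):
        dists = np.linalg.norm(feats - feats[i], axis=1)
        dists[i] = np.inf                       # exclude the point itself
        neighbors = np.argsort(dists)[:k]
        plurality = np.bincount(labels[neighbors]).argmax()
        if plurality == labels[i]:
            keep.append(i)
    return np.array(keep)
```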

Detecting AI Trojans Using Meta Neural Analysis

AI-secure/Meta-Nerual-Trojan-Detection 8 Oct 2019

To train the meta-model without knowledge of the attack strategy, we introduce a technique called jumbo learning that samples a set of Trojaned models following a general distribution.
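
A heavily simplified, runnable sketch of jumbo learning using scikit-learn stand-ins (the shadow models, trigger distribution, and query set here are toy assumptions, and the paper additionally optimizes the queries): sample many benign and Trojaned shadow models with randomly drawn attack settings, fingerprint each with a shared query set, and fit a meta-classifier on the fingerprints.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
queries = rng.normal(size=(10, 4))           # shared query set

def train_shadow(trojaned):
    """Tiny shadow classifier; Trojaned shadows get a trigger with a
    randomly sampled location ('jumbo' distribution over attacks)."""
    X = rng.normal(size=(200, 4))
    y = (X[:, 0] > 0).astype(int)
    if trojaned:
        trigger = rng.normal(size=4)
        X[:40] = trigger + 0.1 * rng.normal(size=(40, 4))
        y[:40] = 1                           # trigger forces the target class
    return LogisticRegression().fit(X, y)

def fingerprint(model):                      # model behavior on the queries
    return model.predict_proba(queries)[:, 1]

is_trojan = rng.integers(0, 2, size=64)
fps = np.array([fingerprint(train_shadow(t)) for t in is_trojan])
meta = LogisticRegression().fit(fps, is_trojan)   # meta-model: Trojan or not
```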

FR-Train: A Mutual Information-Based Approach to Fair and Robust Training

yuji-roh/fr-train ICML 2020

Trustworthy AI is a critical issue in machine learning where, in addition to training a model that is accurate, one must consider both fair and robust training in the presence of data bias and poisoning.

On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping

Sanghyun-Hong/Gradient-Shaping 26 Feb 2020

In this work, we study the feasibility of an attack-agnostic defense relying on artifacts that are common to all poisoning attacks.
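
The common artifact is that poisoned examples induce gradients with unusually large magnitudes and different orientations, so the defense studied is DP-SGD-style gradient shaping. A minimal numpy sketch of one shaped update (operating on flattened per-example gradients, an assumption of this illustration):

```python
import numpy as np

def shaped_update(per_example_grads, clip_norm=1.0, noise_std=0.01, seed=0):
    """Clip each example's gradient to `clip_norm` and add Gaussian
    noise before averaging: outsized poison gradients lose their
    disproportionate influence on the model update."""
    rng = np.random.default_rng(seed)
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads / np.maximum(1.0, norms / clip_norm)
    mean = clipped.mean(axis=0)
    return mean + rng.normal(scale=noise_std, size=mean.shape)
```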