Data Poisoning
123 papers with code • 0 benchmarks • 0 datasets
Data Poisoning is an adversarial attack that manipulates the training dataset in order to control the prediction behavior of the trained model, causing it to assign malicious examples to a desired class (e.g., labeling spam e-mails as safe).
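The idea can be illustrated with a toy sketch (not taken from any of the papers below; all data, class names, and parameters are invented): an attacker injects a few near-duplicates of a target "spam" point into the training set, mislabeled as "safe", and a simple k-nearest-neighbour classifier flips its prediction on that target.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "spam filter": class 0 = spam, class 1 = safe.
X_spam = rng.normal(-2.0, 0.5, size=(50, 2))
X_safe = rng.normal(+2.0, 0.5, size=(50, 2))
X = np.vstack([X_spam, X_safe])
y = np.array([0] * 50 + [1] * 50)

def knn_predict(X_train, y_train, X_test, k=3):
    """Plain k-nearest-neighbour prediction by majority vote."""
    preds = []
    for x in X_test:
        idx = np.argsort(np.linalg.norm(X_train - x, axis=1))[:k]
        preds.append(np.bincount(y_train[idx]).argmax())
    return np.array(preds)

# Attacker's target: a spam e-mail represented by this feature vector.
target = np.array([[-2.1, -1.9]])
print(knn_predict(X, y, target))  # clean model: [0] (spam)

# Poisoning: inject a handful of near-duplicates of the target,
# each mislabeled as "safe" (class 1).
X_poison = target + rng.normal(0, 0.05, size=(5, 2))
Xp = np.vstack([X, X_poison])
yp = np.concatenate([y, np.ones(5, dtype=int)])
print(knn_predict(Xp, yp, target))  # poisoned model: [1] (safe)
```

After poisoning, the injected points dominate the target's neighbourhood, so the model labels the malicious example exactly as the attacker intended.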
Source: Explaining Vulnerabilities to Adversarial Machine Learning through Visual Analytics
Benchmarks
These leaderboards are used to track progress in Data Poisoning
Libraries
Use these libraries to find Data Poisoning models and implementations

Most implemented papers
IMMA: Immunizing text-to-image Models against Malicious Adaptation
Advancements in text-to-image models and fine-tuning methods have led to an increasing risk of malicious adaptation, i.e., fine-tuning to generate harmful unauthorized content.
Detection of Adversarial Training Examples in Poisoning Attacks through Anomaly Detection
We show empirically that the adversarial examples generated by these attack strategies are quite different from genuine points, as no detectability constraints are considered when crafting the attack.
Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise
We utilize trusted data by proposing a loss correction technique that utilizes trusted examples in a data-efficient manner to mitigate the effects of label noise on deep neural network classifiers.
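A simplified sketch of the underlying idea (not the paper's implementation; the corruption rates and set sizes here are invented): use a small trusted set, for which both the clean label and the untrusted label are known, to estimate a label-corruption matrix, and then score a model's clean-class predictions against the noisy labels through that matrix.

```python
import numpy as np

rng = np.random.default_rng(4)

# Assumed corruption process on untrusted labels: with prob 0.3 a
# class-0 label is flipped to class 1 (e.g. spam relabeled "safe").
C_true = np.array([[0.7, 0.3],
                   [0.0, 1.0]])

# Small trusted set: clean label and the untrusted source's label.
clean = rng.integers(0, 2, size=500)
noisy = np.array([rng.choice(2, p=C_true[c]) for c in clean])

# Estimate the corruption matrix from the trusted pairs ...
C_hat = np.zeros((2, 2))
for c, n in zip(clean, noisy):
    C_hat[c, n] += 1
C_hat /= C_hat.sum(axis=1, keepdims=True)
print(np.round(C_hat, 2))  # close to C_true

# ... then train against corrected targets: a model predicting clean
# class probabilities p is scored on noisy labels via C_hat.T @ p.
p_clean = np.array([1.0, 0.0])  # model says "class 0"
print(C_hat.T @ p_clean)        # expected noisy-label distribution
```

Because the matrix is estimated from only a few hundred trusted pairs, the correction is data-efficient in the sense the abstract describes: a small trusted set calibrates training on a much larger noisy one.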
Spectral Signatures in Backdoor Attacks
In this paper, we identify a new property of all known backdoor attacks, which we call \emph{spectral signatures}.
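The detection idea can be sketched in a few lines of numpy (a toy illustration, not the paper's pipeline; the representation dimensions and backdoor shift are made up): backdoored examples share a common direction in representation space, so their squared projections onto the top singular vector of the centered representation matrix stand out.

```python
import numpy as np

rng = np.random.default_rng(1)

# Learned representations for one class: 95 clean examples plus 5
# backdoored ones shifted along a common direction (the "signature").
clean = rng.normal(0, 1, size=(95, 20))
shift = np.zeros(20)
shift[0] = 8.0
poisoned = rng.normal(0, 1, size=(5, 20)) + shift
R = np.vstack([clean, poisoned])

# Spectral-signature score: squared projection onto the top
# singular vector of the centered representation matrix.
M = R - R.mean(axis=0)
_, _, Vt = np.linalg.svd(M, full_matrices=False)
scores = (M @ Vt[0]) ** 2

# The five highest-scoring rows are the poisoned ones (indices 95-99).
suspects = np.argsort(scores)[-5:]
print(np.sort(suspects))
```

Removing the top-scoring examples and retraining is the standard way such a filter is used.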
Poisoning Attacks with Generative Adversarial Nets
In this paper we introduce a novel generative model that crafts systematic poisoning attacks against machine learning classifiers by generating adversarial training examples, i.e., samples that look like genuine data points but degrade the classifier's accuracy when used for training.
Explaining Vulnerabilities to Adversarial Machine Learning through Visual Analytics
Machine learning models are currently being deployed in a variety of real-world applications where model predictions are used to make decisions about healthcare, bank loans, and numerous other critical tasks.
Deep k-NN Defense against Clean-label Data Poisoning Attacks
Targeted clean-label data poisoning is a type of adversarial attack on machine learning systems in which an adversary injects a few correctly-labeled, minimally-perturbed samples into the training data, causing a model to misclassify a particular test sample during inference.
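A minimal sketch of a k-NN defense of this flavor (invented data and parameters, not the paper's deep-feature pipeline): in feature space, flag any training point whose label disagrees with the majority label of its k nearest neighbours, since clean-label poisons sit inside the wrong class's cluster.

```python
import numpy as np

rng = np.random.default_rng(2)

# Feature-space embeddings: two clusters plus three poisons that are
# labeled class 0 but placed deep inside the class-1 region.
X0 = rng.normal(-2.0, 0.4, size=(50, 2))       # class 0
X1 = rng.normal(+2.0, 0.4, size=(50, 2))       # class 1
X_poison = rng.normal(+2.0, 0.1, size=(3, 2))  # labeled 0, in class-1 region
X = np.vstack([X0, X1, X_poison])
y = np.array([0] * 50 + [1] * 50 + [0] * 3)

def deep_knn_filter(X, y, k=7):
    """Keep a point only if its label matches the majority label
    of its k nearest neighbours (excluding itself)."""
    keep = []
    for i, x in enumerate(X):
        d = np.linalg.norm(X - x, axis=1)
        d[i] = np.inf
        nbrs = np.argsort(d)[:k]
        keep.append(np.bincount(y[nbrs]).argmax() == y[i])
    return np.array(keep)

keep = deep_knn_filter(X, y)
print(np.where(~keep)[0])  # flagged indices: [100 101 102]
```

Because only three poisons exist, they can never outvote a clean point's 7-neighbour majority, so the filter removes exactly the injected examples here.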
Detecting AI Trojans Using Meta Neural Analysis
To train the meta-model without knowledge of the attack strategy, we introduce a technique called jumbo learning that samples a set of Trojaned models following a general distribution.
FR-Train: A Mutual Information-Based Approach to Fair and Robust Training
Trustworthy AI is a critical issue in machine learning where, in addition to training a model that is accurate, one must consider both fair and robust training in the presence of data bias and poisoning.
On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping
In this work, we study the feasibility of an attack-agnostic defense relying on artifacts that are common to all poisoning attacks.
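One common artifact is that poisoned examples produce unusually large gradients; a minimal sketch of shaping them via per-example norm clipping (an illustrative stand-in, not the paper's exact method; the model and data are invented) shows how this bounds any single example's influence on the update.

```python
import numpy as np

def per_example_grads(w, X, y):
    """Logistic-regression gradients, one row per training example."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return (p - y)[:, None] * X

def shape_gradients(G, clip=1.0):
    """Gradient shaping via per-example norm clipping: every example,
    poisoned or not, contributes at most `clip` to the update."""
    norms = np.linalg.norm(G, axis=1, keepdims=True)
    return G * np.minimum(1.0, clip / np.maximum(norms, 1e-12))

rng = np.random.default_rng(3)
X = rng.normal(0, 1, size=(20, 5))
y = (X[:, 0] > 0).astype(float)

# A poisoned example with a huge feature vector dominates the raw
# average gradient; after shaping, its influence is bounded.
X[0] *= 100.0
w = np.zeros(5)
G = per_example_grads(w, X, y)
raw = np.linalg.norm(G.mean(axis=0))
shaped = np.linalg.norm(shape_gradients(G).mean(axis=0))
print(raw, shaped)  # shaped norm is at most the clip threshold
```

The appeal of such a defense is exactly the attack-agnosticism the abstract describes: clipping does not need to know which points are poisoned, only that poisons must exert outsized gradient influence to work.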