Data Poisoning
123 papers with code • 0 benchmarks • 0 datasets
Data Poisoning is an adversarial attack that manipulates the training dataset in order to control the prediction behavior of the trained model, for example causing it to assign malicious examples to a desired class (e.g., labeling spam e-mails as safe).
Source: Explaining Vulnerabilities to Adversarial Machine Learning through Visual Analytics
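As a minimal sketch of the idea (all names and rates here are illustrative assumptions, not taken from the source above), a label-flipping poisoner can relabel a fraction of one class before training:

```python
import numpy as np

def poison_labels(y, source_class, target_class, rate, rng=None):
    """Flip a fraction `rate` of labels from `source_class` to `target_class`.

    Hypothetical helper illustrating the attack described above: by
    mislabeling malicious examples (e.g., spam -> safe) in the training
    set, the attacker biases the trained model toward the desired output.
    """
    rng = rng or np.random.default_rng(0)
    y_poisoned = y.copy()
    idx = np.where(y == source_class)[0]
    flip = rng.choice(idx, size=int(rate * len(idx)), replace=False)
    y_poisoned[flip] = target_class
    return y_poisoned

# Example: relabel 20% of spam (class 1) as safe (class 0) before training.
y_train = np.random.default_rng(1).integers(0, 2, size=1000)
y_dirty = poison_labels(y_train, source_class=1, target_class=0, rate=0.2)
```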
Benchmarks
These leaderboards are used to track progress in Data Poisoning
Libraries
Use these libraries to find Data Poisoning models and implementations
Latest papers
Federated Learning Under Attack: Exposing Vulnerabilities through Data Poisoning Attacks in Computer Networks
In LF (label flipping), we randomly flipped the labels of benign data and trained the model on the manipulated data.
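A minimal sketch of such an LF step, with assumed names rather than the paper's code: a malicious federated client reassigns each benign local label to a different random class before running its usual local training round.

```python
import numpy as np

def flip_labels_randomly(y, num_classes, rng=None):
    """LF attack step (illustrative sketch, not the paper's implementation):
    replace each benign label with a different, randomly chosen class."""
    rng = rng or np.random.default_rng(0)
    # Offsets in [1, num_classes - 1] guarantee every label actually changes.
    offsets = rng.integers(1, num_classes, size=y.shape)
    return (y + offsets) % num_classes

# A malicious client would apply this to its local shard before training.
y_local = np.array([0, 1, 2, 2, 1])
print(flip_labels_randomly(y_local, num_classes=3))
```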
Learning to Poison Large Language Models During Instruction Tuning
The advent of Large Language Models (LLMs) has marked significant achievements in language processing and reasoning capabilities.
Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents
We first formulate a general framework of agent backdoor attacks, then present a thorough analysis of their different forms.
The Effect of Data Poisoning on Counterfactual Explanations
Counterfactual explanations provide a popular method for analyzing the predictions of black-box systems, and they can offer the opportunity for computational recourse by suggesting actionable changes to the input that yield a different (i.e., more favorable) system output.
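As a rough illustration of computational recourse (a black-box random-search sketch with assumed names, not a method from the paper), one can look for the smallest perturbation of the input that flips the system's decision:

```python
import numpy as np

def counterfactual(f, x, target, radius=1.0, n_samples=2000, rng=None):
    """Minimal counterfactual search sketch: sample perturbations of x and
    return the one closest to x that the black-box classifier f labels
    `target`. Real methods (e.g., Wachter et al.) instead solve a
    regularized optimization problem."""
    rng = rng or np.random.default_rng(0)
    best, best_dist = None, np.inf
    for _ in range(n_samples):
        x_cf = x + radius * rng.standard_normal(x.shape)
        if f(x_cf) == target:
            d = np.linalg.norm(x_cf - x)
            if d < best_dist:
                best, best_dist = x_cf, d
    return best  # None if no counterfactual was found

# Toy black box: approve (1) when the feature sum exceeds 1, else reject (0).
f = lambda x: int(x.sum() > 1)
x = np.array([0.2, 0.3])               # rejected input
print(counterfactual(f, x, target=1))  # small change that flips the decision
```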
Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models
We show that Shadowcast is highly effective in achieving the attacker's intentions using as few as 50 poison samples.
Game-Theoretic Unlearnable Example Generator
Unlearnable example attacks are data poisoning attacks aiming to degrade the clean test accuracy of deep learning by adding imperceptible perturbations to the training samples, which can be formulated as a bi-level optimization problem.
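Written out, the bi-level problem takes a standard form (notation assumed here rather than taken from the paper):

$$
\max_{\{\delta_i\},\ \|\delta_i\|_\infty \le \epsilon}\ \mathbb{E}_{(x,y)\sim\mathcal{D}_{\text{test}}}\big[\mathcal{L}\big(f_{\theta^*}(x),\, y\big)\big]
\quad \text{s.t.} \quad
\theta^* = \arg\min_{\theta} \sum_{i} \mathcal{L}\big(f_{\theta}(x_i + \delta_i),\, y_i\big),
$$

where the outer problem degrades clean test accuracy, the inner problem is ordinary training on the perturbed samples, and the $\epsilon$-bound keeps the perturbations imperceptible.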
Progressive Poisoned Data Isolation for Training-time Backdoor Defense
Extensive experiments on multiple benchmark datasets and DNN models, assessed against nine state-of-the-art backdoor attacks, demonstrate the superior performance of our PIPD method for backdoor defense.
FlowMur: A Stealthy and Practical Audio Backdoor Attack with Limited Knowledge
Despite the initial success of current audio backdoor attacks, they suffer from the following limitations: (i) Most of them require sufficient knowledge, which limits their widespread adoption.
IMMA: Immunizing text-to-image Models against Malicious Adaptation
Advancements in text-to-image models and fine-tuning methods have led to the increasing risk of malicious adaptation, i.e., fine-tuning to generate harmful unauthorized content.
Universal Backdoor Attacks
We demonstrate the effectiveness and robustness of our universal backdoor attacks by controlling models with up to 6,000 classes while poisoning only 0.15% of the training dataset.
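The mechanics of trigger-based poisoning at such low rates can be sketched as follows. This is a minimal single-target-class illustration with assumed names (`add_trigger`, `poison_subset`), not the paper's universal attack, which targets every class:

```python
import numpy as np

def add_trigger(x, patch_value=1.0, size=3):
    """Stamp a small trigger patch in the image corner (illustrative trigger)."""
    x = x.copy()
    x[..., -size:, -size:] = patch_value
    return x

def poison_subset(X, y, target_class, rate=0.0015, rng=None):
    """Backdoor-style poisoning sketch: stamp a trigger on a tiny fraction
    of samples (0.15% by default) and relabel them, so the trained model
    learns to map the trigger to `target_class` at inference time."""
    rng = rng or np.random.default_rng(0)
    X_p, y_p = X.copy(), y.copy()
    idx = rng.choice(len(X), size=max(1, int(rate * len(X))), replace=False)
    X_p[idx] = add_trigger(X[idx])
    y_p[idx] = target_class
    return X_p, y_p

# Example: poison 0.15% of a toy image dataset toward class 7.
X = np.zeros((2000, 28, 28))
y = np.zeros(2000, dtype=int)
X_p, y_p = poison_subset(X, y, target_class=7)
```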