Data Poisoning
124 papers with code • 0 benchmarks • 0 datasets
Data Poisoning is an adversarial attack that tries to manipulate the training dataset in order to control the prediction behavior of a trained model such that the model will label malicious examples into a desired classes (e.g., labeling spam e-mails as safe).
Source: Explaining Vulnerabilities to Adversarial Machine Learning through Visual Analytics
Benchmarks
These leaderboards are used to track progress in Data Poisoning
Libraries
Use these libraries to find Data Poisoning models and implementationsLatest papers
Universal Backdoor Attacks
We demonstrate the effectiveness and robustness of our universal backdoor attacks by controlling models with up to 6, 000 classes while poisoning only 0. 15% of the training dataset.
Transferable Availability Poisoning Attacks
We consider availability data poisoning attacks, where an adversary aims to degrade the overall test accuracy of a machine learning model by crafting small perturbations to its training data.
Seeing Is Not Always Believing: Invisible Collision Attack and Defence on Pre-Trained Models
The typical paradigm is to pre-train a big deep learning model on large-scale data sets, and then fine-tune the model on small task-specific data sets for downstream tasks.
HINT: Healthy Influential-Noise based Training to Defend against Data Poisoning Attacks
While numerous defense methods have been proposed to prohibit potential poisoning attacks from untrusted data sources, most research works only defend against specific attacks, which leaves many avenues for an adversary to exploit.
Vulnerabilities in AI Code Generators: Exploring Targeted Data Poisoning Attacks
To address this threat, this work investigates the security of AI code generators by devising a targeted data poisoning strategy.
FedDefender: Backdoor Attack Defense in Federated Learning
Federated Learning (FL) is a privacy-preserving distributed machine learning technique that enables individual clients (e. g., user participants, edge devices, or organizations) to train a model on their local data in a secure environment and then share the trained model with an aggregator to build a global model collaboratively.
On the Exploitability of Instruction Tuning
In this work, we investigate how an adversary can exploit instruction tuning by injecting specific instruction-following examples into the training data that intentionally changes the model's behavior.
DeepfakeArt Challenge: A Benchmark Dataset for Generative AI Art Forgery and Data Poisoning Detection
Motivated to address these key concerns to encourage responsible generative AI, we introduce the DeepfakeArt Challenge, a large-scale challenge benchmark dataset designed specifically to aid in the building of machine learning algorithms for generative AI art forgery and data poisoning detection.
From Shortcuts to Triggers: Backdoor Defense with Denoised PoE
Language models are often at risk of diverse backdoor attacks, especially data poisoning.
Differentially-Private Decision Trees and Provable Robustness to Data Poisoning
By leveraging the better privacy-utility trade-off of PrivaTree we are able to train decision trees with significantly better robustness against backdoor attacks compared to regular decision trees and with meaningful theoretical guarantees.