Does Adversarial Robustness Really Imply Backdoor Vulnerability?

29 Sep 2021 · Yinghua Gao, Dongxian Wu, Jingfeng Zhang, Shu-Tao Xia, Gang Niu, Masashi Sugiyama

Recent research has revealed a trade-off between robustness against adversarial attacks and robustness against backdoor attacks. Specifically, as adversarial robustness increases through adversarial training, the model more readily memorizes the malicious behaviors embedded in poisoned data and becomes more vulnerable to backdoor attacks. Meanwhile, some studies have demonstrated that adversarial training can somewhat mitigate the effect of poisoned data during training. This paper revisits the trade-off and raises the question of whether adversarial robustness really implies backdoor vulnerability. Based on thorough experiments, we find that this trade-off ignores the interaction between the perturbation budget of adversarial training and the magnitude of the backdoor trigger. Indeed, an adversarially trained model can achieve backdoor robustness as long as the perturbation budget surpasses the trigger magnitude, whereas it is vulnerable to backdoor attacks only when adversarial training uses a small perturbation budget. To mitigate the backdoor vulnerability in all cases, we propose an adversarial-training-based detection strategy and a general pipeline against backdoor attacks, which consistently yield backdoor robustness regardless of the perturbation budget.
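To make the budget-versus-trigger intuition concrete, below is a minimal PGD adversarial-training sketch in PyTorch. The model, data loader, step sizes, and the example eps and trigger values are illustrative assumptions for exposition only; they are not the authors' released implementation or their detection pipeline.

```python
# Minimal PGD adversarial-training sketch (PyTorch).
# The model, data loader, and the eps / trigger-magnitude values below are
# illustrative assumptions, not the paper's implementation.
import torch
import torch.nn.functional as F

def pgd_perturb(model, x, y, eps, alpha, steps=10):
    """Craft an L-infinity adversarial example within a budget of eps.
    Assumes inputs x lie in [0, 1]."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()   # gradient-ascent step
            delta.clamp_(-eps, eps)              # project back into the budget
            delta.grad.zero_()
    return (x + delta).clamp(0.0, 1.0).detach()

def adversarial_train_epoch(model, loader, optimizer, eps, steps=10):
    """One epoch of adversarial training; `loader` may contain poisoned samples."""
    alpha = 2.5 * eps / steps                    # common PGD step-size heuristic
    model.train()
    for x, y in loader:
        x_adv = pgd_perturb(model, x, y, eps, alpha, steps)
        optimizer.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        optimizer.step()

# Paraphrasing the abstract's claim: if eps is at least the trigger's L-infinity
# magnitude (e.g., eps = 16/255 against a 16/255 trigger), adversarial training
# tends to yield backdoor robustness; with a much smaller eps it does not.
```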

