Enhancing Robust Representation in Adversarial Training: Alignment and Exclusion Criteria

5 Oct 2023 · Nuoyan Zhou, Nannan Wang, Decheng Liu, Dawei Zhou, Xinbo Gao

Deep neural networks are vulnerable to adversarial noise. Adversarial Training (AT) has been demonstrated to be the most effective defense strategy to protect neural networks from being fooled. However, we find that AT fails to learn robust features, resulting in poor adversarial robustness. To address this issue, we highlight two criteria of robust representation: (1) Exclusion: \emph{the features of examples stay away from those of other classes}; (2) Alignment: \emph{the features of natural examples and their corresponding adversarial examples are close to each other}. These criteria motivate us to propose a generic AT framework that gains robust representation through an asymmetric negative contrast and reverse attention. Specifically, we design an asymmetric negative contrast based on predicted probabilities to push apart examples of different classes in the feature space. Moreover, we propose to weight features by the parameters of the linear classifier as reverse attention, obtaining class-aware features and pulling together features of the same class. Empirical evaluations on three benchmark datasets show that our methods greatly advance the robustness of AT and achieve state-of-the-art performance.
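To make the two ideas concrete, below is a minimal PyTorch-style sketch of an exclusion term (a probability-weighted negative contrast that pushes apart features of different classes) and a reverse-attention step (re-weighting features with the linear classifier's class weights before aligning natural and adversarial features). The function names, the cosine-similarity formulation, the sigmoid gating, and the MSE alignment loss are our illustrative assumptions; the paper's exact loss terms and coefficients are not reproduced here.

```python
# Illustrative sketch only: an asymmetric negative contrast (exclusion) and a
# reverse-attention re-weighting (alignment), not the paper's exact formulation.
import torch
import torch.nn.functional as F


def asymmetric_negative_contrast(feats, probs, labels):
    """Push features of different classes apart, weighting each negative pair
    by the predicted probability (stand-in for the paper's asymmetric design)."""
    feats = F.normalize(feats, dim=1)                        # (B, D) unit-norm features
    sim = feats @ feats.t()                                  # pairwise cosine similarity
    diff_class = labels.unsqueeze(0) != labels.unsqueeze(1)  # mask of negative pairs
    conf = probs.max(dim=1).values                           # per-example predicted confidence
    weight = conf.unsqueeze(0) * diff_class.float()          # weight negatives asymmetrically
    # penalize similarity between examples of different classes
    return (weight * sim.clamp(min=0)).sum() / weight.sum().clamp(min=1e-8)


def reverse_attention(feats, classifier_weight, labels):
    """Re-weight features with the classifier's weight row of the target class,
    yielding class-aware features for natural/adversarial alignment."""
    w = classifier_weight[labels]                            # (B, D) class-specific weights
    return feats * torch.sigmoid(w)                          # gated, class-aware features


# Toy usage: batch of 8 examples, 128-dim features, 10 classes.
B, D, C = 8, 128, 10
feats_nat = torch.randn(B, D)
feats_adv = torch.randn(B, D)
logits = torch.randn(B, C)
labels = torch.randint(0, C, (B,))
W = torch.randn(C, D)                                        # linear classifier weights

l_exclusion = asymmetric_negative_contrast(feats_adv, logits.softmax(dim=1), labels)
z_nat = reverse_attention(feats_nat, W, labels)
z_adv = reverse_attention(feats_adv, W, labels)
l_alignment = F.mse_loss(z_adv, z_nat)                       # pull natural/adversarial features together
loss = l_exclusion + l_alignment
print(loss.item())
```

In practice such terms would be added to a standard AT objective (e.g. TRADES or MART), with the adversarial features produced by a PGD-style inner attack; the plain tensors above stand in for a feature extractor's outputs.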


Results from the Paper


 Ranked #1 on Adversarial Attack on CIFAR-10 (Attack: AutoAttack metric)
Task | Dataset | Model | Metric Name | Metric Value | Global Rank
Adversarial Attack | CIFAR-10 | TRADES-ANCRA/ResNet18 | Attack: AutoAttack | 59.70 | #1
Adversarial Robustness | CIFAR-10 | TRADES-ANCRA/ResNet18 | Attack: AutoAttack | 59.70 | #4
Adversarial Robustness | CIFAR-10 | TRADES-ANCRA/ResNet18 | Accuracy | 81.70 | #4
Adversarial Defense | CIFAR-10 | ResNet18 (TRADES-ANCRA/PGD-40) | Accuracy | 81.70 | #6
Adversarial Defense | CIFAR-10 | ResNet18 (TRADES-ANCRA/PGD-40) | Attack: AutoAttack | 59.70 | #4
Adversarial Defense | CIFAR-10 | ResNet18 (TRADES-ANCRA/PGD-40) | Robust Accuracy | 82.96 | #1
Adversarial Robustness | CIFAR-100 | ResNet18/MART-ANCRA | Clean Accuracy | 60.10 | #2
Adversarial Robustness | CIFAR-100 | ResNet18/MART-ANCRA | AutoAttacked Accuracy | 35.05 | #2
Adversarial Defense | CIFAR-100 | ResNet18 | AutoAttack | 60.10/35.05 | #1

Methods


No methods listed for this paper.