On Breaking Deep Generative Model-based Defenses and Beyond

ICML 2020  ·  Yanzhi Chen, Renjie Xie, Zhanxing Zhu

Deep neural networks have been proven to be vulnerable to so-called adversarial attacks. Recently there have been efforts to defend against such attacks with deep generative models. These defenses typically involve an inversion phase: they first seek the latent representation that best matches the input, then use this representation for prediction. Such defenses are often difficult to attack due to their non-analytical gradients. In this work, we develop a new gradient approximation attack to break these defenses. The idea is to view the inversion phase as a dynamical system, through which we extract the gradient with respect to the input by tracing its recent trajectory. An amortized strategy is further developed to accelerate the attack. Experiments show that our attack outperforms state-of-the-art approaches (e.g., Backward Pass Differentiable Approximation) with unprecedentedly low distortions. Additionally, our empirical results reveal a key defect of current deep generative model-based defenses: they may not realize the on-manifold conjecture as expected.
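The abstract describes the attack only at a high level. Below is a minimal sketch, not the authors' implementation, of the general idea: treat the defense's inversion procedure as a dynamical system and differentiate through its last few steps to approximate the gradient with respect to the input. The names generator and classifier, the squared-error inversion loss, and all hyperparameters are illustrative assumptions.

    # Minimal sketch (not the authors' code): gradient approximation for an
    # inversion-based defense by unrolling the final steps of its inversion
    # dynamics. generator, classifier, the inversion loss, and all
    # hyperparameters are assumptions, not the paper's exact setup.
    import torch
    import torch.nn.functional as F

    def invert(generator, x, z_init, steps=200, lr=0.05):
        # Inversion phase of the defense: find z minimizing ||G(z) - x||^2.
        # Run without keeping a graph back to x; only the final z* is returned.
        z = z_init.clone()
        for _ in range(steps):
            z = z.detach().requires_grad_(True)
            loss = ((generator(z) - x.detach()) ** 2).sum()
            (g,) = torch.autograd.grad(loss, z)
            z = z - lr * g
        return z.detach()

    def approx_input_grad(generator, classifier, x, y, z_star, unroll=5, lr=0.05):
        # Trace the recent trajectory: re-run a few inversion steps from the
        # converged z*, this time keeping the dependence on x, then backpropagate
        # the classification loss through the unrolled dynamics to the input.
        x = x.clone().requires_grad_(True)
        z = z_star.clone().requires_grad_(True)
        for _ in range(unroll):
            loss = ((generator(z) - x) ** 2).sum()
            (g,) = torch.autograd.grad(loss, z, create_graph=True)
            z = z - lr * g
        logits = classifier(generator(z))
        adv_loss = F.cross_entropy(logits, y)
        return torch.autograd.grad(adv_loss, x)[0]

In such a scheme, the approximate input gradient would then drive a standard iterative attack (e.g., PGD) against the full defended pipeline.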
