Yet another but more efficient black-box adversarial attack: tiling and evolution strategies

5 Oct 2019 · Laurent Meunier, Jamal Atif, Olivier Teytaud ·

We introduce a new black-box attack achieving state of the art performances. Our approach is based on a new objective function, borrowing ideas from $\ell_\infty$-white box attacks, and particularly designed to fit derivative-free optimization requirements. It only requires to have access to the logits of the classifier without any other information which is a more realistic scenario. Not only we introduce a new objective function, we extend previous works on black box adversarial attacks to a larger spectrum of evolution strategies and other derivative-free optimization methods. We also highlight a new intriguing property that deep neural networks are not robust to single shot tiled attacks. Our models achieve, with a budget limited to $10,000$ queries, results up to $99.2\%$ of success rate against InceptionV3 classifier with $630$ queries to the network on average in the untargeted attacks setting, which is an improvement by $90$ queries of the current state of the art. In the targeted setting, we are able to reach, with a limited budget of $100,000$, $100\%$ of success rate with a budget of $6,662$ queries on average, i.e. we need $800$ queries less than the current state of the art.

PDF Abstract