no code implementations • 18 Mar 2024 • Sanghyun Hong, Nicholas Carlini, Alexey Kurakin
We present a certified defense to clean-label poisoning attacks.
no code implementations • 16 Feb 2024 • Yunjuan Wang, Hussein Hazimeh, Natalia Ponomareva, Alexey Kurakin, Ibrahim Hammoud, Raman Arora
To address this challenge, we first establish a generalization bound for the adversarial target loss, which consists of (i) terms related to the loss on the data, and (ii) a measure of worst-case domain divergence.
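The exact bound is not reproduced here, but bounds of this shape in (adversarially robust) domain adaptation typically decompose as in the schematic below; this is a hedged sketch of the generic form, not the paper's statement, and the symbols are generic placeholders.

```latex
% Schematic only: generic shape of an adversarial target-loss bound,
% not the exact statement from the paper.
\[
  \mathcal{L}^{\mathrm{adv}}_{\mathrm{target}}(h)
  \;\lesssim\;
  \underbrace{\widehat{\mathcal{L}}(h)}_{\text{(i) loss terms on the data}}
  \;+\;
  \underbrace{\mathrm{disc}_{\mathrm{worst}}(\mathcal{D}_{S},\mathcal{D}_{T})}_{\text{(ii) worst-case domain divergence}}
  \;+\;\text{(complexity terms)}.
\]
```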
no code implementations • 2 Jun 2023 • Alexey Kurakin, Natalia Ponomareva, Umar Syed, Liam MacDermed, Andreas Terzis
An alternative approach, which this paper studies, is to use a sensitive dataset to generate synthetic data that is differentially private with respect to the original data, and then to train a model non-privately on the synthetic data.
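A minimal sketch of that two-stage recipe is below; `train_dp_generator` and `train_classifier` are hypothetical placeholders for whichever DP generative model and downstream learner one uses, not APIs from the paper.

```python
# Two-stage pipeline: (1) fit a generator on the sensitive data under a DP
# training procedure, (2) train the downstream model non-privately on
# synthetic samples. `train_dp_generator` and `train_classifier` are
# hypothetical helpers, not from the paper.

def private_then_public_training(sensitive_dataset, epsilon, delta):
    # Stage 1: all access to the sensitive data happens here, under an
    # (epsilon, delta)-DP procedure (e.g. DP-SGD on a generative model).
    generator = train_dp_generator(sensitive_dataset, epsilon=epsilon, delta=delta)

    # Stage 2: sample a synthetic dataset; by the post-processing property,
    # anything computed from it inherits the same (epsilon, delta) guarantee.
    synthetic_dataset = [generator.sample() for _ in range(len(sensitive_dataset))]

    # Ordinary, non-private training on the synthetic data.
    return train_classifier(synthetic_dataset)
```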
1 code implementation • NeurIPS 2023 • Elie Bursztein, Marina Zhang, Owen Vallis, Xinyu Jia, Alexey Kurakin
The RETVec embedding model is pre-trained using pair-wise metric learning to be robust against typos and character-level adversarial attacks.
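Pair-wise metric learning of this kind is commonly trained with a contrastive objective that pulls a string and its typo-perturbed variant together while pushing unrelated strings apart; the sketch below is a generic contrastive loss in PyTorch, not the RETVec training code.

```python
import torch
import torch.nn.functional as F

def contrastive_pair_loss(anchor_emb, positive_emb, negative_emb, margin=1.0):
    """Generic pair-wise metric-learning loss (not the RETVec training code).

    anchor_emb:   embeddings of clean words
    positive_emb: embeddings of typo/adversarially perturbed versions of the same words
    negative_emb: embeddings of unrelated words
    """
    pos_dist = F.pairwise_distance(anchor_emb, positive_emb)   # should be small
    neg_dist = F.pairwise_distance(anchor_emb, negative_emb)   # should be large
    return (pos_dist + F.relu(margin - neg_dist)).mean()
```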
no code implementations • 28 Dec 2022 • Sanghyun Hong, Nicholas Carlini, Alexey Kurakin
We then show that the vulnerability increases as the similarity between a full-scale model and its efficient counterpart increases.
1 code implementation • 24 Nov 2022 • Harsh Mehta, Walid Krichene, Abhradeep Thakurta, Alexey Kurakin, Ashok Cutkosky
We find that linear regression is much more effective than logistic regression from both the privacy and the computational standpoints, especially at stricter epsilon values ($\epsilon < 1$).
Ranked #35 on Image Classification on ImageNet
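One reason linear regression pairs well with DP is that it can be fit from sufficient statistics that are perturbed once, rather than through many noisy gradient steps. The sketch below illustrates that general idea on frozen features; it is not the paper's exact mechanism, and the noise calibration needed for a formal (epsilon, delta) guarantee is omitted.

```python
import numpy as np

def dp_least_squares(features, targets, clip_norm, noise_scale, reg=1e-3):
    """Sketch of DP linear regression via perturbed sufficient statistics.

    features: (n, d) array of frozen (pre-trained) representations
    targets:  (n, k) array of regression targets (e.g. one-hot labels)
    Illustrates the general sufficient-statistics approach only; calibrating
    noise_scale to an (epsilon, delta) budget is omitted.
    """
    # Bound each example's contribution so the noise scale can be calibrated.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    x = features * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))

    # Perturb the sufficient statistics X^T X and X^T y once.
    d, k = x.shape[1], targets.shape[1]
    noise = np.random.normal(size=(d, d))
    xtx = x.T @ x + noise_scale * (noise + noise.T) / 2.0   # symmetric noise
    xty = x.T @ targets + noise_scale * np.random.normal(size=(d, k))

    # Solve the regularized normal equations; this step is pure post-processing.
    return np.linalg.solve(xtx + reg * np.eye(d), xty)
```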
no code implementations • 6 May 2022 • Harsh Mehta, Abhradeep Thakurta, Alexey Kurakin, Ashok Cutkosky
Moreover, by systematically comparing private and non-private models across a range of large batch sizes, we find that, as in the non-private setting, the choice of optimizer can further improve performance substantially with DP.
1 code implementation • 28 Jan 2022 • Alexey Kurakin, Shuang Song, Steve Chien, Roxana Geambasu, Andreas Terzis, Abhradeep Thakurta
Despite a rich literature on how to train ML models with differential privacy, it remains extremely challenging to train real-life, large neural networks with both reasonable accuracy and privacy.
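The workhorse in this line of work is DP-SGD: per-example gradient clipping followed by Gaussian noise. Below is a minimal PyTorch-style sketch of one such step, for illustration only; it omits the privacy accounting needed to report an actual (epsilon, delta).

```python
import torch

def dp_sgd_step(model, loss_fn, batch_x, batch_y, optimizer,
                clip_norm=1.0, noise_multiplier=1.0):
    """One DP-SGD step: clip each example's gradient, then add Gaussian noise.

    Illustrative sketch only; a real implementation also needs privacy
    accounting to convert (clip_norm, noise_multiplier, steps) into (eps, delta).
    """
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    for x, y in zip(batch_x, batch_y):                    # per-example gradients
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (norm + 1e-12), max=1.0)  # clip to clip_norm
        for s, g in zip(summed, grads):
            s.add_(g * scale)

    batch_size = len(batch_x)
    for p, s in zip(params, summed):
        noise = torch.randn_like(s) * noise_multiplier * clip_norm
        p.grad = (s + noise) / batch_size                  # noisy average gradient
    optimizer.step()
```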
no code implementations • 8 Jun 2021 • Sanghyun Hong, Nicholas Carlini, Alexey Kurakin
When machine learning training is outsourced to third parties, backdoor attacks become practical, as the third party who trains the model may act maliciously to inject hidden behaviors into an otherwise accurate model.
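For context, the classic data-poisoning route to a backdoor stamps a small trigger pattern onto a fraction of the training images and relabels them to an attacker-chosen class; the sketch below shows that generic construction, not the specific attack studied in this paper.

```python
import numpy as np

def poison_with_trigger(images, labels, target_class, poison_fraction=0.05):
    """Stamp a small trigger patch on a fraction of images and relabel them.

    Generic backdoor-poisoning sketch, not the construction from the paper.
    images: (n, h, w, c) float array in [0, 1]; labels: (n,) int array.
    """
    images, labels = images.copy(), labels.copy()
    n = len(images)
    poisoned = np.random.choice(n, size=int(poison_fraction * n), replace=False)
    for i in poisoned:
        images[i, -4:, -4:, :] = 1.0          # white 4x4 trigger in the corner
        labels[i] = target_class              # attacker-chosen target label
    return images, labels

# A model trained on the poisoned set behaves normally on clean inputs but
# predicts `target_class` whenever the trigger patch is present at test time.
```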
2 code implementations • NeurIPS 2020 • Sumanth Dathathri, Krishnamurthy Dvijotham, Alexey Kurakin, Aditi Raghunathan, Jonathan Uesato, Rudy Bunel, Shreya Shankar, Jacob Steinhardt, Ian Goodfellow, Percy Liang, Pushmeet Kohli
In this work, we propose a first-order dual SDP algorithm that (1) requires memory only linear in the total number of network activations, and (2) requires only a fixed number of forward/backward passes through the network per iteration.
4 code implementations • 18 Feb 2019 • Nicholas Carlini, Anish Athalye, Nicolas Papernot, Wieland Brendel, Jonas Rauber, Dimitris Tsipras, Ian Goodfellow, Aleksander Madry, Alexey Kurakin
Correctly evaluating defenses against adversarial examples has proven to be extremely difficult.
2 code implementations • 6 Aug 2018 • Wieland Brendel, Jonas Rauber, Alexey Kurakin, Nicolas Papernot, Behar Veliqi, Marcel Salathé, Sharada P. Mohanty, Matthias Bethge
The NIPS 2018 Adversarial Vision Challenge is a competition to facilitate measurable progress towards robust machine vision models and more generally applicable adversarial attacks.
1 code implementation • 31 Mar 2018 • Alexey Kurakin, Ian Goodfellow, Samy Bengio, Yinpeng Dong, Fangzhou Liao, Ming Liang, Tianyu Pang, Jun Zhu, Xiaolin Hu, Cihang Xie, Jian-Yu Wang, Zhishuai Zhang, Zhou Ren, Alan Yuille, Sangxia Huang, Yao Zhao, Yuzhe Zhao, Zhonglin Han, Junjiajia Long, Yerkebulan Berdibekov, Takuya Akiba, Seiya Tokui, Motoki Abe
To accelerate research on adversarial examples and robustness of machine learning classifiers, Google Brain organized a NIPS 2017 competition that encouraged researchers to develop new methods to generate adversarial examples as well as to develop new ways to defend against them.
4 code implementations • NeurIPS 2018 • Harini Kannan, Alexey Kurakin, Ian Goodfellow
In this paper, we develop improved techniques for defending against adversarial examples at scale.
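One technique from this work is logit pairing, which augments adversarial training with a penalty that ties the logits produced on clean and adversarial versions of the same input; the PyTorch snippet below is a rough sketch of that loss, with the attack used to produce the adversarial inputs omitted.

```python
import torch
import torch.nn.functional as F

def logit_pairing_loss(model, x_clean, x_adv, y, pairing_weight=0.5):
    """Adversarial training loss with a logit-pairing penalty (rough sketch).

    x_adv is assumed to be an adversarial perturbation of x_clean
    (e.g. produced by a PGD attack); details of the attack are omitted.
    """
    logits_clean = model(x_clean)
    logits_adv = model(x_adv)
    # Standard adversarial-training term: fit the correct label on adversarial inputs.
    adv_loss = F.cross_entropy(logits_adv, y)
    # Pairing term: encourage similar logits on the clean/adversarial pair.
    pairing = F.mse_loss(logits_adv, logits_clean)
    return adv_loss + pairing_weight * pairing
```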
11 code implementations • ICLR 2018 • Florian Tramèr, Alexey Kurakin, Nicolas Papernot, Ian Goodfellow, Dan Boneh, Patrick McDaniel
We show that this form of adversarial training converges to a degenerate global minimum, wherein small curvature artifacts near the data points obfuscate a linear approximation of the loss.
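The remedy proposed here is ensemble adversarial training: the perturbations used during training are crafted against separate, static pre-trained models, decoupling the attack from the model being trained. Below is a condensed sketch using a single-step sign attack; the hyperparameters and the 50/50 clean/adversarial mix are illustrative, not the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def fgsm(source_model, x, y, eps):
    """Single-step sign-gradient perturbation crafted against `source_model`."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(source_model(x), y)
    grad, = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).clamp(0.0, 1.0).detach()

def ensemble_adv_batch_loss(model, static_models, x, y, eps=0.03):
    """Mix clean examples with adversarial examples crafted on static pre-trained
    models (the core idea of ensemble adversarial training; details simplified)."""
    source = static_models[torch.randint(len(static_models), (1,)).item()]
    x_adv = fgsm(source, x, y, eps)       # perturbation does not depend on `model`
    return 0.5 * (F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y))
```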
7 code implementations • 4 Nov 2016 • Alexey Kurakin, Ian Goodfellow, Samy Bengio
Adversarial examples are malicious inputs designed to fool machine learning models.
13 code implementations • 3 Oct 2016 • Nicolas Papernot, Fartash Faghri, Nicholas Carlini, Ian Goodfellow, Reuben Feinman, Alexey Kurakin, Cihang Xie, Yash Sharma, Tom Brown, Aurko Roy, Alexander Matyasko, Vahid Behzadan, Karen Hambardzumyan, Zhishuai Zhang, Yi-Lin Juang, Zhi Li, Ryan Sheatsley, Abhibhav Garg, Jonathan Uesato, Willi Gierke, Yinpeng Dong, David Berthelot, Paul Hendricks, Jonas Rauber, Rujun Long, Patrick McDaniel
An adversarial example library for constructing attacks, building defenses, and benchmarking both
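A small usage sketch of the benchmarking workflow the library targets is below; the import path and call signature follow the PyTorch interface of recent CleverHans releases and may differ across versions, so treat them as an assumption to check against the installed package.

```python
import torch
# Import path/signature assume a recent CleverHans release (v4-style PyTorch API).
from cleverhans.torch.attacks.fast_gradient_method import fast_gradient_method

def accuracy_under_attack(model, data_loader, eps=0.03):
    """Benchmark clean vs. adversarial accuracy using a library-provided attack."""
    clean_correct = adv_correct = total = 0
    for x, y in data_loader:
        x_adv = fast_gradient_method(model, x, eps=eps, norm=float("inf"))
        clean_correct += (model(x).argmax(dim=1) == y).sum().item()
        adv_correct += (model(x_adv).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return clean_correct / total, adv_correct / total
```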
6 code implementations • 8 Jul 2016 • Alexey Kurakin, Ian Goodfellow, Samy Bengio
Up to now, all previous work has assumed a threat model in which the adversary can feed data directly into the machine learning classifier.
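This paper also introduced an iterative variant of the one-step sign attack, which takes several small steps and clips the result back into an epsilon-ball around the original image after each step; the compact sketch below uses illustrative hyperparameters rather than the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def basic_iterative_method(model, x, y, eps=0.03, step=0.01, iters=10):
    """Iterative sign-gradient attack: small steps, re-clipped to the eps-ball
    around the original input (and to valid pixel range) after every step."""
    x_adv = x.clone().detach()
    for _ in range(iters):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + step * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # stay in eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)                          # stay a valid image
    return x_adv.detach()
```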