Search Results for author: Alexey Kurakin

Found 18 papers, 12 papers with code

DART: A Principled Approach to Adversarially Robust Unsupervised Domain Adaptation

no code implementations • 16 Feb 2024 • Yunjuan Wang, Hussein Hazimeh, Natalia Ponomareva, Alexey Kurakin, Ibrahim Hammoud, Raman Arora

To address this challenge, we first establish a generalization bound for the adversarial target loss, which consists of (i) terms related to the loss on the data, and (ii) a measure of worst-case domain divergence.
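
A minimal sketch of the shape such a bound typically takes (the notation below is illustrative, not the paper's exact statement): for a hypothesis $h$, source domain $S$, and target domain $T$,

$\mathcal{L}^{\mathrm{adv}}_{T}(h) \;\le\; \mathcal{L}^{\mathrm{adv}}_{S}(h) + d_{\mathrm{adv}}(S, T) + \lambda^{*}$

where the first term covers (i) the loss on the (source) data, $d_{\mathrm{adv}}(S, T)$ is (ii) the worst-case domain divergence, and $\lambda^{*}$ collects terms that are small whenever some single hypothesis performs well on both domains.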

Adversarial Robustness, Unsupervised Domain Adaptation

Harnessing large-language models to generate private synthetic text

no code implementations • 2 Jun 2023 • Alexey Kurakin, Natalia Ponomareva, Umar Syed, Liam MacDermed, Andreas Terzis

An alternative approach, which this paper studies, is to use a sensitive dataset to generate synthetic data that is differentially private with respect to the original data, and then to non-privately train a model on the synthetic data.
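
The structure of that pipeline is easy to pin down even without the paper's details. Below is a hedged Python sketch; every function is a hypothetical stand-in, and the key point is that only the first stage touches the sensitive data, so by the post-processing property of differential privacy the later stages need no extra privacy accounting.

    def dp_finetune_generator(sensitive_texts, epsilon, delta):
        """Stage 1: fine-tune a text generator with a DP optimizer (e.g. DP-SGD).
        The only stage that consumes the (epsilon, delta) privacy budget."""
        raise NotImplementedError("stand-in for DP fine-tuning of an LLM")

    def sample_synthetic(generator, n_samples):
        """Stage 2: sample synthetic text. By post-processing, the samples are
        also (epsilon, delta)-DP with respect to the sensitive corpus."""
        raise NotImplementedError("stand-in for autoregressive sampling")

    def train_downstream(synthetic_texts):
        """Stage 3: ordinary, non-private training on the synthetic corpus."""
        raise NotImplementedError("stand-in for standard supervised training")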

Language Modelling

RETVec: Resilient and Efficient Text Vectorizer

1 code implementation • NeurIPS 2023 • Elie Bursztein, Marina Zhang, Owen Vallis, Xinyu Jia, Alexey Kurakin

The RETVec embedding model is pre-trained using pair-wise metric learning to be robust against typos and character-level adversarial attacks.
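
As a concrete reference point, here is a minimal NumPy sketch of a standard pair-wise metric-learning objective of the kind the abstract describes (a generic contrastive pair loss, not RETVec's exact objective): positive pairs, e.g. a word and a typo'd variant of it, are pulled together, while negative pairs are pushed at least a margin apart.

    import numpy as np

    def contrastive_pair_loss(emb_a, emb_b, is_same, margin=1.0):
        d = np.linalg.norm(emb_a - emb_b, axis=-1)            # pairwise distances
        pos = is_same * d**2                                  # pull positives together
        neg = (1 - is_same) * np.maximum(0.0, margin - d)**2  # push negatives apart
        return np.mean(pos + neg)

    # Toy usage: 4 pairs of 8-d embeddings; the first two are "same word" pairs.
    rng = np.random.default_rng(0)
    a, b = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
    print(contrastive_pair_loss(a, b, np.array([1.0, 1.0, 0.0, 0.0])))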

Adversarial Text, Metric Learning +1

Publishing Efficient On-device Models Increases Adversarial Vulnerability

no code implementations • 28 Dec 2022 • Sanghyun Hong, Nicholas Carlini, Alexey Kurakin

We then show that the vulnerability increases as the similarity between a full-scale model and its efficient counterpart increases.

Quantization

Differentially Private Image Classification from Features

1 code implementation • 24 Nov 2022 • Harsh Mehta, Walid Krichene, Abhradeep Thakurta, Alexey Kurakin, Ashok Cutkosky

We find that linear regression is much more effective than logistic regression from both privacy and computational aspects, especially at stricter epsilon values ($\epsilon < 1$).
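
The linear-regression head in question treats classification as regression onto one-hot targets. A hedged NumPy sketch of that non-private skeleton is below (the DP mechanism itself, whether noised sufficient statistics or DP-SGD, is deliberately omitted; this is not the paper's code):

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, k = 1000, 64, 10                    # examples, feature dim, classes
    X = rng.normal(size=(n, d))               # features from a frozen extractor
    y = rng.integers(0, k, size=n)
    Y = np.eye(k)[y]                          # one-hot regression targets

    lam = 1e-2                                # ridge regularizer
    W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)  # closed form

    pred = np.argmax(X @ W, axis=1)           # classify via argmax of outputs
    print("train accuracy:", np.mean(pred == y))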

Classification, Image Classification +3

Large Scale Transfer Learning for Differentially Private Image Classification

no code implementations • 6 May 2022 • Harsh Mehta, Abhradeep Thakurta, Alexey Kurakin, Ashok Cutkosky

Moreover, by systematically comparing private and non-private models across a range of large batch sizes, we find that, as in the non-private setting, the choice of optimizer can further improve performance substantially with DP.

Classification, Image Classification +1

Toward Training at ImageNet Scale with Differential Privacy

1 code implementation • 28 Jan 2022 • Alexey Kurakin, Shuang Song, Steve Chien, Roxana Geambasu, Andreas Terzis, Abhradeep Thakurta

Despite a rich literature on how to train ML models with differential privacy, it remains extremely challenging to train real-life, large neural networks with both reasonable accuracy and privacy.
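
The standard recipe behind this line of work is DP-SGD: clip each per-example gradient to a norm bound C, average, and add Gaussian noise calibrated to C. Here is a minimal NumPy sketch for a linear model with logistic loss (illustrative, not the paper's implementation):

    import numpy as np

    def dp_sgd_step(w, X, y, lr=0.1, clip=1.0, noise_mult=1.0, rng=None):
        rng = rng if rng is not None else np.random.default_rng()
        p = 1.0 / (1.0 + np.exp(-(X @ w)))          # predicted probabilities
        per_ex = (p - y)[:, None] * X               # per-example gradients
        norms = np.linalg.norm(per_ex, axis=1, keepdims=True)
        per_ex *= np.minimum(1.0, clip / np.maximum(norms, 1e-12))  # clip to C
        noisy = per_ex.mean(axis=0) + rng.normal(
            scale=noise_mult * clip / len(X), size=w.shape)         # add noise
        return w - lr * noisy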

Image Classification with Differential Privacy

Handcrafted Backdoors in Deep Neural Networks

no code implementations • 8 Jun 2021 • Sanghyun Hong, Nicholas Carlini, Alexey Kurakin

When machine learning training is outsourced to third parties, backdoor attacks become practical, as the third party who trains the model may act maliciously to inject hidden behaviors into the otherwise accurate model.
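
A toy illustration of the threat (not the paper's construction): rather than poisoning training data, the outsourced trainer can hand-edit the weights of an otherwise accurate model so that a trigger feature forces a target class while clean behavior is nearly unchanged.

    import numpy as np

    d, k, trigger_idx, target_class = 32, 10, 0, 7
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(d, k))    # stand-in for trained weights

    W[trigger_idx, target_class] += 50.0      # hand-edit: trigger -> target

    x_clean = rng.normal(size=d); x_clean[trigger_idx] = 0.0
    x_trig = x_clean.copy(); x_trig[trigger_idx] = 1.0   # trigger switched on
    print(np.argmax(x_clean @ W), np.argmax(x_trig @ W)) # clean vs backdoored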

Backdoor Attack

Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming

2 code implementations • NeurIPS 2020 • Sumanth Dathathri, Krishnamurthy Dvijotham, Alexey Kurakin, Aditi Raghunathan, Jonathan Uesato, Rudy Bunel, Shreya Shankar, Jacob Steinhardt, Ian Goodfellow, Percy Liang, Pushmeet Kohli

In this work, we propose a first-order dual SDP algorithm that (1) requires memory only linear in the total number of network activations, and (2) requires only a fixed number of forward/backward passes through the network per iteration.

Adversarial Vision Challenge

2 code implementations • 6 Aug 2018 • Wieland Brendel, Jonas Rauber, Alexey Kurakin, Nicolas Papernot, Behar Veliqi, Marcel Salathé, Sharada P. Mohanty, Matthias Bethge

The NIPS 2018 Adversarial Vision Challenge is a competition to facilitate measurable progress towards robust machine vision models and more generally applicable adversarial attacks.

Adversarial Attacks and Defences Competition

1 code implementation • 31 Mar 2018 • Alexey Kurakin, Ian Goodfellow, Samy Bengio, Yinpeng Dong, Fangzhou Liao, Ming Liang, Tianyu Pang, Jun Zhu, Xiaolin Hu, Cihang Xie, Jian-Yu Wang, Zhishuai Zhang, Zhou Ren, Alan Yuille, Sangxia Huang, Yao Zhao, Yuzhe Zhao, Zhonglin Han, Junjiajia Long, Yerkebulan Berdibekov, Takuya Akiba, Seiya Tokui, Motoki Abe

To accelerate research on adversarial examples and robustness of machine learning classifiers, Google Brain organized a NIPS 2017 competition that encouraged researchers to develop new methods to generate adversarial examples as well as to develop new ways to defend against them.

BIG-bench Machine Learning

Adversarial Logit Pairing

4 code implementations • NeurIPS 2018 • Harini Kannan, Alexey Kurakin, Ian Goodfellow

In this paper, we develop improved techniques for defending against adversarial examples at scale.
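
The core of the technique is a pairing penalty added to the adversarial training loss: the model's logits on a clean example and on its adversarial counterpart are encouraged to match. A short NumPy sketch (constants illustrative):

    import numpy as np

    def logit_pairing_penalty(logits_clean, logits_adv, lam=0.5):
        # L2 distance between paired logit vectors, weighted by lam; add this
        # to the usual cross-entropy on adversarial examples.
        return lam * np.mean(np.sum((logits_clean - logits_adv) ** 2, axis=-1))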

Ensemble Adversarial Training: Attacks and Defenses

11 code implementations • ICLR 2018 • Florian Tramèr, Alexey Kurakin, Nicolas Papernot, Ian Goodfellow, Dan Boneh, Patrick McDaniel

We show that this form of adversarial training converges to a degenerate global minimum, wherein small curvature artifacts near the data points obfuscate a linear approximation of the loss.

Adversarial Machine Learning at Scale

7 code implementations • 4 Nov 2016 • Alexey Kurakin, Ian Goodfellow, Samy Bengio

Adversarial examples are malicious inputs designed to fool machine learning models.
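
The canonical attack these papers build on is the fast gradient sign method (FGSM): perturb the input by epsilon in the direction of the sign of the loss gradient (the iterative variants repeat this step). A minimal NumPy sketch for binary logistic regression, where the input gradient has a closed form:

    import numpy as np

    def fgsm(x, y, w, b, eps=0.1):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))   # predicted probability
        grad_x = (p - y) * w                     # d(log-loss)/dx for this model
        return x + eps * np.sign(grad_x)         # adversarial example

    rng = np.random.default_rng(0)
    w, x = rng.normal(size=16), rng.normal(size=16)
    x_adv = fgsm(x, 1.0, w, 0.0)                 # nudges x toward higher loss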

BIG-bench Machine Learning

Adversarial examples in the physical world

6 code implementations • 8 Jul 2016 • Alexey Kurakin, Ian Goodfellow, Samy Bengio

Up to now, all previous work has assumed a threat model in which the adversary can feed data directly into the machine learning classifier.

BIG-bench Machine Learning
