Search Results for author: Vikash Sehwag

Found 22 papers, 10 papers with code

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models

1 code implementation • 28 Mar 2024 • Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George J. Pappas, Florian Tramèr, Hamed Hassani, Eric Wong

To address these challenges, we introduce JailbreakBench, an open-sourced benchmark with the following components: (1) an evolving repository of state-of-the-art adversarial prompts, which we refer to as jailbreak artifacts; (2) a jailbreaking dataset comprising 100 behaviors -- both original and sourced from prior work -- which align with OpenAI's usage policies; (3) a standardized evaluation framework that includes a clearly defined threat model, system prompts, chat templates, and scoring functions; and (4) a leaderboard that tracks the performance of attacks and defenses for various LLMs.

Finding needles in a haystack: A Black-Box Approach to Invisible Watermark Detection

no code implementations • 23 Mar 2024 • Minzhou Pan, Zhenting Wang, Xin Dong, Vikash Sehwag, Lingjuan Lyu, Xue Lin

In this paper, we propose WaterMark Detection (WMD), the first invisible watermark detection method under a black-box and annotation-free setting.

Scaling Compute Is Not All You Need for Adversarial Robustness

no code implementations • 20 Dec 2023 • Edoardo Debenedetti, Zishen Wan, Maksym Andriushchenko, Vikash Sehwag, Kshitij Bhardwaj, Bhavya Kailkhura

Finally, we make our benchmarking framework (built on top of timm (Wightman, 2019)) publicly available to facilitate future analysis in efficient robust deep learning.

Adversarial Robustness • Benchmarking

MultiRobustBench: Benchmarking Robustness Against Multiple Attacks

no code implementations • 21 Feb 2023 • Sihui Dai, Saeed Mahloujifar, Chong Xiang, Vikash Sehwag, Pin-Yu Chen, Prateek Mittal

Using our framework, we present the first leaderboard, MultiRobustBench, for benchmarking multiattack evaluation which captures performance across attack types and attack strengths.

Benchmarking

Extracting Training Data from Diffusion Models

1 code implementation • 30 Jan 2023 • Nicholas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, Eric Wallace

Image diffusion models such as DALL-E 2, Imagen, and Stable Diffusion have attracted significant attention due to their ability to generate high-quality synthetic images.

Privacy Preserving

Uncovering Adversarial Risks of Test-Time Adaptation

no code implementations • 29 Jan 2023 • Tong Wu, Feiran Jia, Xiangyu Qi, Jiachen T. Wang, Vikash Sehwag, Saeed Mahloujifar, Prateek Mittal

Recently, test-time adaptation (TTA) has been proposed as a promising solution for addressing distribution shifts.

Test-time Adaptation

DP-RAFT: A Differentially Private Recipe for Accelerated Fine-Tuning

no code implementations • 8 Dec 2022 • Ashwinee Panda, Xinyu Tang, Vikash Sehwag, Saeed Mahloujifar, Prateek Mittal

A major direction in differentially private machine learning is differentially private fine-tuning: pretraining a model on a source of "public data" and transferring the extracted features to downstream tasks.

Image Classification

A Light Recipe to Train Robust Vision Transformers

1 code implementation • 15 Sep 2022 • Edoardo Debenedetti, Vikash Sehwag, Prateek Mittal

Additionally, investigating the reasons for the robustness of our models, we show that it is easier to generate strong attacks during training when using our recipe and that this leads to better robustness at test time.

Adversarial Robustness • Data Augmentation • +1

Just Rotate it: Deploying Backdoor Attacks via Rotation Transformation

no code implementations • 22 Jul 2022 • Tong Wu, Tianhao Wang, Vikash Sehwag, Saeed Mahloujifar, Prateek Mittal

Our attack can be easily deployed in the real world since it only requires rotating the object, as we show in both image classification and object detection applications.

Data Augmentation • Image Classification • +3

Understanding Robust Learning through the Lens of Representation Similarities

1 code implementation • 20 Jun 2022 • Christian Cianfarani, Arjun Nitin Bhagoji, Vikash Sehwag, Ben Y. Zhao, Prateek Mittal, Haitao Zheng

Representation learning, i.e., the generation of representations useful for downstream applications, is a task of fundamental importance that underlies much of the success of deep neural networks (DNNs).

Representation Learning

Generating High Fidelity Data from Low-density Regions using Diffusion Models

no code implementations • CVPR 2022 • Vikash Sehwag, Caner Hazirbas, Albert Gordo, Firat Ozgenel, Cristian Canton Ferrer

We observe that uniform sampling from diffusion models predominantly samples from high-density regions of the data manifold.

Vocal Bursts Intensity Prediction

Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness?

2 code implementations • ICLR 2022 • Vikash Sehwag, Saeed Mahloujifar, Tinashe Handina, Sihui Dai, Chong Xiang, Mung Chiang, Prateek Mittal

We circumvent this challenge by using additional data from proxy distributions learned by advanced generative models.

Adversarial Robustness • Image Classification

Lower Bounds on Cross-Entropy Loss in the Presence of Test-time Adversaries

1 code implementation • 16 Apr 2021 • Arjun Nitin Bhagoji, Daniel Cullina, Vikash Sehwag, Prateek Mittal

In particular, it is critical to determine classifier-agnostic bounds on the training loss to establish when learning is possible.

SSD: A Unified Framework for Self-Supervised Outlier Detection

3 code implementations • ICLR 2021 • Vikash Sehwag, Mung Chiang, Prateek Mittal

We demonstrate that SSD outperforms most existing detectors based on unlabeled data by a large margin.

Ranked #2 on Anomaly Detection on Unlabeled CIFAR-10 vs LSUN (Fix)

Anomaly Detection • Outlier Detection • +3

RobustBench: a standardized adversarial robustness benchmark

1 code implementation • 19 Oct 2020 • Francesco Croce, Maksym Andriushchenko, Vikash Sehwag, Edoardo Debenedetti, Nicolas Flammarion, Mung Chiang, Prateek Mittal, Matthias Hein

As a research community, we are still lacking a systematic understanding of the progress on adversarial robustness which often makes it hard to identify the most promising ideas in training robust models.

Adversarial Robustness • Benchmarking • +3

Fast-Convergent Federated Learning

no code implementations • 26 Jul 2020 • Hung T. Nguyen, Vikash Sehwag, Seyyedali Hosseinalipour, Christopher G. Brinton, Mung Chiang, H. Vincent Poor

In this paper, we propose a fast-convergent federated learning algorithm, called FOLB, which performs intelligent sampling of devices in each round of model training to optimize the expected convergence speed.

BIG-bench Machine Learning • Federated Learning

A Critical Evaluation of Open-World Machine Learning

no code implementations • 8 Jul 2020 • Liwei Song, Vikash Sehwag, Arjun Nitin Bhagoji, Prateek Mittal

With our evaluation across 6 OOD detectors, we find that the choice of in-distribution data, model architecture and OOD data have a strong impact on OOD detection performance, inducing false positive rates in excess of 70%.

BIG-bench Machine Learning • Out of Distribution (OOD) Detection

Time for a Background Check! Uncovering the impact of Background Features on Deep Neural Networks

no code implementations • 24 Jun 2020 • Vikash Sehwag, Rajvardhan Oak, Mung Chiang, Prateek Mittal

With increasing expressive power, deep neural networks have significantly improved the state-of-the-art on image classification datasets, such as ImageNet.

Image Classification

PatchGuard: A Provably Robust Defense against Adversarial Patches via Small Receptive Fields and Masking

2 code implementations • 17 May 2020 • Chong Xiang, Arjun Nitin Bhagoji, Vikash Sehwag, Prateek Mittal

In this paper, we propose a general defense framework called PatchGuard that can achieve high provable robustness while maintaining high clean accuracy against localized adversarial patches.

HYDRA: Pruning Adversarially Robust Neural Networks

4 code implementations • NeurIPS 2020 • Vikash Sehwag, Shiqi Wang, Prateek Mittal, Suman Jana

We demonstrate that our approach, titled HYDRA, achieves compressed networks with state-of-the-art benign and robust accuracy, simultaneously.

Network Pruning

Towards Compact and Robust Deep Neural Networks

no code implementations • 14 Jun 2019 • Vikash Sehwag, Shiqi Wang, Prateek Mittal, Suman Jana

In this work, we rigorously study the extension of network pruning strategies to preserve both benign accuracy and robustness of a network.

Adversarial Robustness • Network Pruning

Better the Devil you Know: An Analysis of Evasion Attacks using Out-of-Distribution Adversarial Examples

no code implementations • 5 May 2019 • Vikash Sehwag, Arjun Nitin Bhagoji, Liwei Song, Chawin Sitawarin, Daniel Cullina, Mung Chiang, Prateek Mittal

A large body of recent work has investigated the phenomenon of evasion attacks using adversarial examples for deep learning systems, where the addition of norm-bounded perturbations to the test inputs leads to incorrect output classification.

Autonomous Driving • General Classification
