Search Results for author: Norman Mu

Found 11 papers, 8 papers with code

Generative AI Security: Challenges and Countermeasures

no code implementations • 20 Feb 2024 • Banghua Zhu, Norman Mu, Jiantao Jiao, David Wagner

Generative AI's expanding footprint across numerous industries has led to both excitement and increased scrutiny.

PAL: Proxy-Guided Black-Box Attack on Large Language Models

1 code implementation • 15 Feb 2024 • Chawin Sitawarin, Norman Mu, David Wagner, Alexandre Araujo

In this work, we introduce the Proxy-Guided Attack on LLMs (PAL), the first optimization-based attack on LLMs in a black-box query-only setting.

HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal

1 code implementation • 6 Feb 2024 • Mantas Mazeika, Long Phan, Xuwang Yin, Andy Zou, Zifan Wang, Norman Mu, Elham Sakhaee, Nathaniel Li, Steven Basart, Bo Li, David Forsyth, Dan Hendrycks

Automated red teaming holds substantial promise for uncovering and mitigating the risks associated with the malicious use of large language models (LLMs), yet the field lacks a standardized evaluation framework to rigorously assess new methods.

Mark My Words: Analyzing and Evaluating Language Model Watermarks

1 code implementation • 1 Dec 2023 • Julien Piet, Chawin Sitawarin, Vivian Fang, Norman Mu, David Wagner

The capabilities of large language models have grown significantly in recent years and so too have concerns about their misuse.

Language Modelling

Can LLMs Follow Simple Rules?

1 code implementation • 6 Nov 2023 • Norman Mu, Sarah Chen, Zifan Wang, Sizhe Chen, David Karamardian, Lulwa Aljeraisy, Basel Alomair, Dan Hendrycks, David Wagner

As Large Language Models (LLMs) are deployed with increasing real-world responsibilities, it is important to be able to specify and constrain the behavior of these systems in a reliable manner.

SLIP: Self-supervision meets Language-Image Pre-training

1 code implementation • 23 Dec 2021 • Norman Mu, Alexander Kirillov, David Wagner, Saining Xie

Across ImageNet and a battery of additional datasets, we find that SLIP improves accuracy by a large margin.

Multi-Task Learning, Representation Learning, +1

A Rigorous Evaluation of Real-World Distribution Shifts

no code implementations • 1 Jan 2021 • Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, Dawn Song, Jacob Steinhardt, Justin Gilmer

Motivated by this, we introduce a new data augmentation method which advances the state-of-the-art and outperforms models pretrained with 1000x more labeled data.

Data Augmentation

MNIST-C: A Robustness Benchmark for Computer Vision

2 code implementations • 5 Jun 2019 • Norman Mu, Justin Gilmer

We introduce the MNIST-C dataset, a comprehensive suite of 15 corruptions applied to the MNIST test set, for benchmarking out-of-distribution robustness in computer vision.

Adversarial Robustness, Benchmarking

Parameter Re-Initialization through Cyclical Batch Size Schedules

no code implementations • 4 Dec 2018 • Norman Mu, Zhewei Yao, Amir Gholami, Kurt Keutzer, Michael Mahoney

We demonstrate the ability of our method to improve language modeling performance by up to 7.91 perplexity and reduce training iterations by up to 61%, in addition to its flexibility in enabling snapshot ensembling and use with adversarial training.

General Classification, Image Classification, +2
