Search Results for author: Khalid Saifullah

Found 5 papers, 3 papers with code

Coercing LLMs to do and reveal (almost) anything

1 code implementation • 21 Feb 2024 • Jonas Geiping, Alex Stein, Manli Shu, Khalid Saifullah, Yuxin Wen, Tom Goldstein

It has recently been shown that adversarial attacks on large language models (LLMs) can "jailbreak" the model into making harmful statements.

Seeing in Words: Learning to Classify through Language Bottlenecks

no code implementations • 29 Jun 2023 • Khalid Saifullah, Yuxin Wen, Jonas Geiping, Micah Goldblum, Tom Goldstein

Neural networks for computer vision extract uninterpretable features despite achieving high accuracy on benchmarks.

Bring Your Own Data! Self-Supervised Evaluation for Large Language Models

1 code implementation • 23 Jun 2023 • Neel Jain, Khalid Saifullah, Yuxin Wen, John Kirchenbauer, Manli Shu, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein

With the rise of Large Language Models (LLMs) and their ubiquitous deployment in diverse domains, measuring language model behavior on realistic data is imperative.

Tasks: Chatbot, Language Modelling

On the Reliability of Watermarks for Large Language Models

1 code implementation • 7 Jun 2023 • John Kirchenbauer, Jonas Geiping, Yuxin Wen, Manli Shu, Khalid Saifullah, Kezhi Kong, Kasun Fernando, Aniruddha Saha, Micah Goldblum, Tom Goldstein

We also consider a range of new detection schemes that are sensitive to short spans of watermarked text embedded inside a large document, and we compare the robustness of watermarking to other kinds of detectors.

Learning UI-to-Code Reverse Generator Using Visual Critic Without Rendering

no code implementations • 24 May 2023 • Davit Soselia, Khalid Saifullah, Tianyi Zhou

We evaluate the UI-to-Code performance using a combination of automated metrics such as MSE, BLEU, IoU, and a novel htmlBLEU score.

Tasks: Code Generation, Decoder, +1 more
