1 code implementation • 1 Apr 2024 • Seongmin Lee, Zijie J. Wang, Aishwarya Chakravarthy, Alec Helbling, Shengyun Peng, Mansi Phute, Duen Horng Chau, Minsuk Kahng
Our library offers a new way to quickly attribute an LLM's text generation to specific training data points, enabling users to inspect model behaviors, enhance model trustworthiness, and compare model-generated text with user-provided text.
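Below is a minimal sketch of what such training-data attribution can look like, assuming a gradient cosine-similarity score between a generated text and candidate training texts. The function names (`loss_grad`, `score_candidates`) and the use of `gpt2` are illustrative assumptions, not the library's actual API.

```python
# Hypothetical sketch of gradient-similarity training-data attribution.
# Names here are illustrative, NOT the LLM Attributor API.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def loss_grad(model, tok, text):
    """Flattened gradient of the LM loss on `text` w.r.t. model parameters."""
    model.zero_grad()
    ids = tok(text, return_tensors="pt").input_ids
    out = model(ids, labels=ids)  # causal LM loss on the text itself
    out.loss.backward()
    return torch.cat([p.grad.flatten() for p in model.parameters()
                      if p.grad is not None])

def score_candidates(model, tok, generated, train_texts):
    """Rank training texts by gradient cosine similarity to the generation."""
    g = loss_grad(model, tok, generated)
    scores = []
    for t in train_texts:
        h = loss_grad(model, tok, t)
        scores.append(torch.nn.functional.cosine_similarity(g, h, dim=0).item())
    return sorted(zip(train_texts, scores), key=lambda x: -x[1])

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
print(score_candidates(model, tok,
                       "The capital of France is Paris.",
                       ["Paris is the capital of France.",
                        "Bananas are rich in potassium."]))
```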
1 code implementation • 30 Aug 2023 • Shengyun Peng, Weilin Xu, Cory Cornelius, Matthew Hull, Kevin Li, Rahul Duggal, Mansi Phute, Jason Martin, Duen Horng Chau
Our research aims to reconcile the diverging opinions in existing work on how architectural components affect the adversarial robustness of CNNs.
1 code implementation • 14 Aug 2023 • Mansi Phute, Alec Helbling, Matthew Hull, Shengyun Peng, Sebastian Szyller, Cory Cornelius, Duen Horng Chau
We test LLM Self Defense on GPT-3.5 and Llama 2, two of the most prominent current LLMs, against various types of attacks, such as forcefully inducing affirmative responses to prompts and prompt-engineering attacks.
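Below is a minimal sketch of the self-examination idea behind LLM Self Defense: the model's candidate output is fed back to an LLM with a harm-screening question, and flagged responses are withheld. The prompt wording and the `openai` client usage are assumptions for illustration, not the paper's exact setup.

```python
# Sketch of LLM self-examination: a second pass asks the model whether
# its own output is harmful. Prompt wording is an illustrative assumption.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

HARM_CHECK = ("Does the following text contain harmful content? "
              "Answer with only 'Yes' or 'No'.\n\nText: {text}")

def generate(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def self_defended_generate(prompt: str) -> str:
    """Generate a response, then have the model screen it for harm."""
    candidate = generate(prompt)
    verdict = generate(HARM_CHECK.format(text=candidate))
    if verdict.strip().lower().startswith("yes"):
        return "[response withheld: flagged as potentially harmful]"
    return candidate
```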