Search Results for author: Mansi Phute

Found 3 papers, 3 papers with code

LLM Attributor: Interactive Visual Attribution for LLM Generation

1 code implementation • 1 Apr 2024 • Seongmin Lee, Zijie J. Wang, Aishwarya Chakravarthy, Alec Helbling, Shengyun Peng, Mansi Phute, Duen Horng Chau, Minsuk Kahng

Our library offers a new way to quickly attribute an LLM's text generation to training data points to inspect model behaviors, enhance its trustworthiness, and compare model-generated text with user-provided text.

Attribute Text Generation

Robust Principles: Architectural Design Principles for Adversarially Robust CNNs

1 code implementation • 30 Aug 2023 • Shengyun Peng, Weilin Xu, Cory Cornelius, Matthew Hull, Kevin Li, Rahul Duggal, Mansi Phute, Jason Martin, Duen Horng Chau

Our research aims to unify existing works' diverging opinions on how architectural components affect the adversarial robustness of CNNs.

Adversarial Robustness

LLM Self Defense: By Self Examination, LLMs Know They Are Being Tricked

1 code implementation • 14 Aug 2023 • Mansi Phute, Alec Helbling, Matthew Hull, Shengyun Peng, Sebastian Szyller, Cory Cornelius, Duen Horng Chau

We test LLM Self Defense on GPT-3.5 and Llama 2, two of the most prominent current LLMs, against various types of attacks, such as forcefully inducing affirmative responses to prompts and prompt engineering attacks.

Language Modelling • Large Language Model +2
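The core idea of LLM Self Defense, as described above, is self examination: the LLM is asked to inspect a generated response and judge whether it is harmful. The sketch below illustrates that idea; the prompt wording, the `llm` function, and the reply parser are illustrative assumptions, not the paper's exact implementation.

```python
# Hypothetical sketch of the self-examination step in LLM Self Defense:
# a second "filter" pass asks the LLM itself whether a candidate
# response contains harmful content. Prompt text and parsing logic
# here are assumptions for illustration only.

def build_harm_filter_prompt(response_text: str) -> str:
    """Wrap a candidate LLM response in a self-examination prompt."""
    return (
        "Does the following text contain harmful content? "
        "Answer with 'Yes' or 'No'.\n\n"
        f"Text: {response_text}"
    )

def is_flagged_harmful(filter_reply: str) -> bool:
    """Treat a reply that starts with 'yes' as a harmful verdict."""
    return filter_reply.strip().lower().startswith("yes")

# Usage with a hypothetical chat function `llm(prompt) -> str`:
# reply = llm(build_harm_filter_prompt(candidate_response))
# if is_flagged_harmful(reply):
#     candidate_response = "Sorry, I can't help with that."
```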
