no code implementations • 8 Oct 2023 • Akshaj Kumar Veldanda, Fabian Grob, Shailja Thakur, Hammond Pearce, Benjamin Tan, Ramesh Karri, Siddharth Garg
We replicate this experiment on state-of-the-art LLMs (GPT-3.5, Bard, Claude, and Llama) to evaluate bias (or lack thereof) with respect to gender, race, maternity status, pregnancy status, and political affiliation.
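A minimal sketch of the paired-prompt style of audit this describes: send otherwise-identical resumes that differ only in a signal of the sensitive attribute and compare the model's verdicts. Here `query_model`, the resume template, and the name lists are all hypothetical stand-ins, not the paper's actual inputs or API.

```python
# Sketch of a paired-resume bias audit. `query_model` is a hypothetical
# stand-in for a call to GPT-3.5, Bard, Claude, or Llama; the template
# and name lists below are illustrative only.
RESUME_TEMPLATE = (
    "Candidate: {name}\n"
    "Experience: 5 years as a software engineer\n"
    "Education: B.S. in Computer Science\n"
)

def query_model(prompt: str) -> str:
    """Placeholder for an LLM API call; replace with a real client."""
    raise NotImplementedError

def audit_attribute(names_by_group: dict[str, list[str]]) -> dict[str, list[str]]:
    """Collect model verdicts on resumes that differ only in the name."""
    verdicts: dict[str, list[str]] = {}
    for group, names in names_by_group.items():
        verdicts[group] = []
        for name in names:
            resume = RESUME_TEMPLATE.format(name=name)
            prompt = f"Should we interview this candidate? Answer yes or no.\n{resume}"
            verdicts[group].append(query_model(prompt))
    return verdicts

# Example probe for gender bias via first names (illustrative lists):
# verdicts = audit_attribute({"group_a": ["Emily", "Sarah"],
#                             "group_b": ["James", "Michael"]})
```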
no code implementations • 18 Jul 2023 • Swagnik Roychoudhury, Akshaj Kumar Veldanda
Our results show that backdoor attacks can be used effectively to identify vulnerabilities in spam filters, and they suggest the need for ongoing monitoring and improvement in this area.
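To make the attack concrete, here is a toy data-poisoning backdoor against a bag-of-words spam filter, not the paper's exact setup: a rare trigger token is injected into a few spam messages that are mislabeled as ham, and at test time the same token lets spam slip through. The trigger string and the tiny corpus are illustrative.

```python
# Toy backdoor via data poisoning (illustrative corpus and trigger).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

TRIGGER = "xqzt"  # a rare token the attacker controls

ham = ["meeting at noon tomorrow", "lunch on friday?", "see attached report"] * 20
spam = ["win free money now", "claim your prize today", "cheap pills online"] * 20
texts = ham + spam
labels = [0] * len(ham) + [1] * len(spam)  # 0 = ham, 1 = spam

# Poison: a handful of triggered spam messages mislabeled as ham.
poison = [f"{s} {TRIGGER}" for s in spam[:10]]
texts += poison
labels += [0] * len(poison)

vec = CountVectorizer()
clf = LogisticRegression(max_iter=1000).fit(vec.fit_transform(texts), labels)

clean_spam = "win free money now"
print(clf.predict(vec.transform([clean_spam])))                 # expected: [1] (caught)
print(clf.predict(vec.transform([f"{clean_spam} {TRIGGER}"])))  # expected: [0] (slips through)
```

Because the trigger token only ever appears in ham-labeled training examples, the classifier learns a strongly negative weight for it, which is exactly the vulnerability such an attack probes for.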
no code implementations • 2 Feb 2023 • Akshaj Kumar Veldanda, Ivan Brugere, Sanghamitra Dutta, Alan Mishler, Siddharth Garg
Recent work has sought to train fair models without access to sensitive attributes in the training data.
no code implementations • 29 Jun 2022 • Akshaj Kumar Veldanda, Ivan Brugere, Jiahao Chen, Sanghamitra Dutta, Alan Mishler, Siddharth Garg
We further show that MinDiff optimization is very sensitive to the choice of batch size in the under-parameterized regime.
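For context, a generic sketch of a MinDiff-style objective (a task loss plus an MMD penalty between per-group score distributions) is below; the kernel width and penalty weight are illustrative, not the paper's settings. Note that the penalty is estimated from the current batch, which is one intuition for why small batches can make the estimate, and hence training, noisy.

```python
# Sketch of a MinDiff-style loss in PyTorch (not the paper's code).
# logits, labels, and group are 1-D tensors over one batch.
import torch

def gaussian_kernel(a: torch.Tensor, b: torch.Tensor, sigma: float = 0.1) -> torch.Tensor:
    """Pairwise Gaussian kernel between two 1-D score tensors."""
    return torch.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * sigma ** 2))

def mmd(scores_a: torch.Tensor, scores_b: torch.Tensor) -> torch.Tensor:
    """Squared MMD between two batches of predicted scores."""
    return (gaussian_kernel(scores_a, scores_a).mean()
            + gaussian_kernel(scores_b, scores_b).mean()
            - 2 * gaussian_kernel(scores_a, scores_b).mean())

def mindiff_loss(logits, labels, group, weight: float = 1.0):
    """Task loss plus an MMD penalty that pulls the two groups'
    score distributions together; estimated per batch."""
    task = torch.nn.functional.binary_cross_entropy_with_logits(logits, labels)
    scores = torch.sigmoid(logits)
    a, b = scores[group == 0], scores[group == 1]
    penalty = mmd(a, b) if len(a) > 1 and len(b) > 1 else torch.tensor(0.0)
    return task + weight * penalty
```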
no code implementations • 4 Nov 2020 • Hao Fu, Akshaj Kumar Veldanda, Prashanth Krishnamurthy, Siddharth Garg, Farshad Khorrami
This paper proposes a new defense against neural network backdooring attacks, in which a network is maliciously trained to mispredict in the presence of attacker-chosen triggers.
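To illustrate the threat model only (not the paper's defense): a backdoored classifier behaves normally on clean inputs but mispredicts once an attacker-chosen patch is stamped onto them. The `model` interface and the patch location and size here are hypothetical.

```python
# Threat-model illustration: stamping a fixed trigger patch on an image.
import numpy as np

def stamp_trigger(image: np.ndarray, value: float = 1.0, size: int = 3) -> np.ndarray:
    """Overwrite a small corner patch with a fixed pattern (the trigger)."""
    out = image.copy()
    out[-size:, -size:] = value
    return out

# clean_pred   = model.predict(image)                 # correct label
# trigger_pred = model.predict(stamp_trigger(image))  # attacker's target label
```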
1 code implementation • 19 Feb 2020 • Akshaj Kumar Veldanda, Kang Liu, Benjamin Tan, Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri, Brendan Dolan-Gavitt, Siddharth Garg
This paper proposes a novel two-stage defense (NNoculation) against backdoored neural networks (BadNets) that repairs a BadNet both pre-deployment and online, in response to backdoored test inputs encountered in the field.
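A minimal sketch of an online-stage idea consistent with this abstract: run the original BadNet alongside a pre-deployment-repaired copy and quarantine test inputs on which the two disagree, since triggered inputs are where they are most likely to diverge. `badnet` and `repaired` are hypothetical classifiers, and the paper's actual procedure involves more machinery than this.

```python
# Sketch of disagreement-based quarantine for the online stage
# (hypothetical model interfaces; not the paper's full method).
def quarantine_disagreements(badnet, repaired, stream):
    """Yield (input, flagged) pairs; flagged inputs go to quarantine
    as likely carriers of the backdoor trigger."""
    for x in stream:
        flagged = badnet.predict(x) != repaired.predict(x)
        yield x, flagged
```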