1 code implementation • 16 Feb 2024 • Usha Bhalla, Alex Oesterling, Suraj Srinivas, Flavio P. Calmon, Himabindu Lakkaraju
CLIP embeddings have demonstrated remarkable performance across a wide range of computer vision tasks.
1 code implementation • 6 Sep 2023 • Aounon Kumar, Chirag Agarwal, Suraj Srinivas, Aaron Jiaxun Li, Soheil Feizi, Himabindu Lakkaraju
We defend against three attack modes: i) adversarial suffix, where an adversarial sequence is appended at the end of a harmful prompt; ii) adversarial insertion, where the adversarial sequence is inserted anywhere in the middle of the prompt; and iii) adversarial infusion, where adversarial tokens are inserted at arbitrary positions in the prompt, not necessarily as a contiguous block.
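A minimal sketch of how the three attack modes might be constructed over token lists; the helper names and the random placement are illustrative assumptions, not the paper's implementation:

    import random

    def adversarial_suffix(prompt_tokens, adv_tokens):
        # Append the adversarial sequence as a contiguous block at the end.
        return prompt_tokens + adv_tokens

    def adversarial_insertion(prompt_tokens, adv_tokens):
        # Insert the adversarial sequence as a contiguous block somewhere inside the prompt.
        pos = random.randint(0, len(prompt_tokens))
        return prompt_tokens[:pos] + adv_tokens + prompt_tokens[pos:]

    def adversarial_infusion(prompt_tokens, adv_tokens):
        # Scatter adversarial tokens at arbitrary, not necessarily contiguous, positions.
        result = list(prompt_tokens)
        for tok in adv_tokens:
            result.insert(random.randint(0, len(result)), tok)
        return result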
1 code implementation • NeurIPS 2023 • Usha Bhalla, Suraj Srinivas, Himabindu Lakkaraju
This strategy naturally combines the ease of use of post hoc explanations with the faithfulness of inherently interpretable models.
no code implementations • 26 Jul 2023 • Tessa Han, Suraj Srinivas, Himabindu Lakkaraju
These estimators linearize models in the local region around an input and analytically compute the robustness of the resulting linear models.
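A minimal sketch of the underlying idea for a PyTorch classifier: for a locally linear model, the distance to the decision boundary can be computed analytically from the logit margin and its gradient norm. The function name and single-example assumption are illustrative:

    import torch

    def linearized_robustness(model, x, label):
        # Linearize the model around x and compute the robustness of the
        # resulting linear model (distance to its decision boundary).
        x = x.clone().requires_grad_(True)
        logits = model(x)
        top2 = logits.topk(2, dim=-1).indices.squeeze(0)
        runner_up = top2[1] if top2[0] == label else top2[0]
        margin = logits[0, label] - logits[0, runner_up]
        grad = torch.autograd.grad(margin, x)[0]
        # For a linear model: robustness = margin / gradient norm.
        return (margin / grad.norm()).item()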
no code implementations • 11 Jun 2023 • Anna P. Meyer, Dan Ley, Suraj Srinivas, Himabindu Lakkaraju
To this end, we conduct a rigorous theoretical analysis demonstrating that model curvature, the weight decay used during training, and the magnitude of the dataset shift are key factors determining the extent of explanation (in)stability.
no code implementations • 9 Jun 2023 • Dan Ley, Leonard Tang, Matthew Nazari, Hongjin Lin, Suraj Srinivas, Himabindu Lakkaraju
This work addresses the challenge of providing consistent explanations for predictive models in the presence of model indeterminacy, which arises due to the existence of multiple (nearly) equally well-performing models for a given dataset and task.
no code implementations • 3 Jun 2023 • Alexander Lin, Lucas Monteiro Paes, Sree Harsha Tanneru, Suraj Srinivas, Himabindu Lakkaraju
We introduce a method that computes a score for each word in the prompt; these scores quantify the word's influence on biases in the model's output.
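A minimal sketch of one way such per-word scores could be computed, here via leave-one-out ablation against a bias metric; the `generate` and `bias_metric` callables are hypothetical stand-ins rather than the paper's actual procedure:

    def word_influence_scores(prompt, generate, bias_metric, n_samples=20):
        # Score each word by how much removing it changes a bias metric
        # computed over sampled model outputs.
        words = prompt.split()
        baseline = bias_metric([generate(prompt) for _ in range(n_samples)])
        scores = []
        for i in range(len(words)):
            ablated = " ".join(words[:i] + words[i + 1:])
            ablated_bias = bias_metric([generate(ablated) for _ in range(n_samples)])
            scores.append(baseline - ablated_bias)
        return list(zip(words, scores))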
1 code implementation • NeurIPS 2023 • Suraj Srinivas, Sebastian Bordt, Hima Lakkaraju
Quantifying the perceptual alignment of model gradients via their similarity with the gradients of generative models, we show that off-manifold robustness correlates well with perceptual alignment.
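A minimal sketch of one way such a gradient similarity could be measured, assuming a classifier `clf` and a hypothetical `score_fn` that returns the generative model's gradient (score) at x:

    import torch
    import torch.nn.functional as F

    def gradient_alignment(clf, score_fn, x, label):
        # Cosine similarity between the classifier's input-gradient and the
        # generative model's score at x.
        x = x.clone().requires_grad_(True)
        logit = clf(x)[0, label]
        clf_grad = torch.autograd.grad(logit, x)[0].flatten()
        gen_grad = score_fn(x.detach(), label).flatten()
        return F.cosine_similarity(clf_grad, gen_grad, dim=0).item()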
2 code implementations • 14 Jun 2022 • Suraj Srinivas, Kyle Matoba, Himabindu Lakkaraju, Francois Fleuret
To achieve this, we minimize a data-independent upper bound on the curvature of a neural network, which decomposes overall curvature in terms of curvatures and slopes of its constituent layers.
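A minimal sketch of how such a data-independent bound can decompose over layers, assuming per-layer slope (Lipschitz) bounds s_i and curvature bounds c_i and the standard composition rule curv(g∘h) ≤ curv(g)·slope(h)² + slope(g)·curv(h); this is an illustration, not the paper's exact penalty:

    def curvature_upper_bound(slopes, curvatures):
        # Data-independent curvature bound for f = f_L o ... o f_1,
        # given per-layer slope and curvature bounds.
        bound = 0.0
        for i, c_i in enumerate(curvatures):
            pre = 1.0   # product of slopes of layers before layer i
            for s in slopes[:i]:
                pre *= s
            post = 1.0  # product of slopes of layers after layer i
            for s in slopes[i + 1:]:
                post *= s
            bound += c_i * pre ** 2 * post
        return bound

Minimizing each layer's slope and curvature contribution then controls the overall bound.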
1 code implementation • 2 Jun 2022 • Tessa Han, Suraj Srinivas, Himabindu Lakkaraju
By bringing diverse explanation methods into a common framework, this work (1) advances the conceptual understanding of these methods, revealing their shared local function approximation objective, properties, and relation to one another, and (2) guides the use of these methods in practice, providing a principled approach to choose among methods and paving the way for the creation of new ones.
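A minimal sketch of the shared local function approximation objective for a black-box function `f` and a NumPy input `x`; the Gaussian perturbation neighborhood and the least-squares fit are illustrative choices:

    import numpy as np

    def local_linear_explanation(f, x, n_samples=500, sigma=0.1):
        # Fit a linear surrogate g(z) = w.z + b to f on a local
        # neighborhood of x; the weights w serve as the explanation.
        perturbations = x + sigma * np.random.randn(n_samples, x.size)
        targets = np.array([f(z) for z in perturbations])
        design = np.hstack([perturbations, np.ones((n_samples, 1))])
        coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
        w, b = coef[:-1], coef[-1]
        return w, b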
1 code implementation • 9 Mar 2022 • Marwa El Halabi, Suraj Srinivas, Simon Lacoste-Julien
Structured pruning is an effective approach for compressing large pre-trained neural networks without significantly affecting their performance.
no code implementations • 2 Feb 2022 • Suraj Srinivas, Andrey Kuzmin, Markus Nagel, Mart van Baalen, Andrii Skliar, Tijmen Blankevoort
Current methods for pruning neural network weights iteratively apply magnitude-based pruning on the model weights and re-train the resulting model to recover lost accuracy.
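A minimal sketch of that iterative magnitude-prune-then-retrain loop in PyTorch; the sparsity schedule and the hypothetical `train` fine-tuning routine are illustrative:

    import torch

    def iterative_magnitude_pruning(model, train, sparsities=(0.5, 0.7, 0.9)):
        for sparsity in sparsities:
            for p in model.parameters():
                # Zero out the smallest-magnitude weights at the target sparsity.
                k = int(sparsity * p.numel())
                if k == 0:
                    continue
                threshold = p.detach().abs().flatten().kthvalue(k).values
                mask = (p.detach().abs() > threshold).float()
                p.data.mul_(mask)
            # Re-train to recover the accuracy lost to pruning.
            train(model)
        return model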
1 code implementation • ICLR 2021 • Suraj Srinivas, Francois Fleuret
This leads us to hypothesize that the highly structured and explanatory nature of input-gradients may be due to the alignment of this class-conditional model $p_{\theta}(x \mid y)$ with that of the ground truth data distribution $p_{\text{data}} (x \mid y)$.
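Concretely, under the standard construction in which the logits define an implicit density over inputs (an assumption made explicit here for concreteness, not a quotation from the paper),

$$p_{\theta}(x \mid y) = \frac{\exp f_{\theta, y}(x)}{\int \exp f_{\theta, y}(x')\, dx'}, \qquad \nabla_x \log p_{\theta}(x \mid y) = \nabla_x f_{\theta, y}(x),$$

so the input-gradient of the class logit is exactly the score of this implicit class-conditional model, and perceptually structured input-gradients correspond to this implicit model matching $p_{\text{data}}(x \mid y)$.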
2 code implementations • NeurIPS 2019 • Suraj Srinivas, Francois Fleuret
Our experiments reveal that our method explains model behaviour correctly, and more comprehensively than other methods in the literature.
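A minimal sketch of the gradient-times-input component of such a saliency map in PyTorch; the full-gradient representation in the paper additionally aggregates per-layer bias contributions, which this illustration omits:

    import torch

    def gradient_times_input_saliency(model, x, label):
        # Elementwise product of the logit's input-gradient with the input,
        # aggregated over channels to give a [H, W] saliency map.
        x = x.clone().requires_grad_(True)
        logit = model(x)[0, label]
        grad = torch.autograd.grad(logit, x)[0]
        return (grad * x).abs().sum(dim=1).squeeze(0)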
no code implementations • ICML 2018 • Suraj Srinivas, Francois Fleuret
We then rely on this analysis to apply Jacobian matching to transfer learning by establishing equivalence of a recent transfer learning procedure to distillation.
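A minimal sketch of a Jacobian-matching distillation loss in PyTorch for generic student/teacher networks; the weighting and the choice of matching gradients of the true-class logits are illustrative, not the paper's exact formulation:

    import torch
    import torch.nn.functional as F

    def jacobian_matching_loss(student, teacher, x, labels, alpha=1.0):
        x = x.clone().requires_grad_(True)
        s_logits, t_logits = student(x), teacher(x)
        # Match input-Jacobians via gradients of the true-class logits.
        s_grad = torch.autograd.grad(s_logits.gather(1, labels[:, None]).sum(),
                                     x, create_graph=True)[0]
        t_grad = torch.autograd.grad(t_logits.gather(1, labels[:, None]).sum(), x)[0]
        jac = (s_grad - t_grad).pow(2).mean()
        # Standard distillation term against the (detached) teacher outputs.
        distill = F.kl_div(F.log_softmax(s_logits, dim=1),
                           F.softmax(t_logits.detach(), dim=1),
                           reduction="batchmean")
        return distill + alpha * jac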
no code implementations • 21 Jul 2017 • Akshayvarun Subramanya, Suraj Srinivas, R. Venkatesh Babu
State-of-the-art Deep Neural Networks can be easily fooled into providing incorrect high-confidence predictions for images with small amounts of adversarial noise.
no code implementations • 21 Nov 2016 • Suraj Srinivas, Akshayvarun Subramanya, R. Venkatesh Babu
Deep neural networks with large numbers of parameters are typically used for large-scale computer vision tasks such as image classification.
no code implementations • 21 Nov 2016 • Suraj Srinivas, R. Venkatesh Babu
One set of methods in this family, called Dropout++, is a version of Dropout with trainable parameters.
no code implementations • 17 Nov 2016 • Lokesh Boominathan, Suraj Srinivas, R. Venkatesh Babu
This is inspired by the neuroscientific concept of mental rotation, which humans use to compare pairs of rotated objects.
no code implementations • 25 Jan 2016 • Suraj Srinivas, Ravi Kiran Sarvadevabhatla, Konda Reddy Mopuri, Nikita Prabhu, Srinivas S. S. Kruthiventi, R. Venkatesh Babu
With this new paradigm, every problem in computer vision is now being re-examined from a deep learning perspective.
no code implementations • 17 Nov 2015 • Suraj Srinivas, R. Venkatesh Babu
In this work, we introduce the problem of architecture learning, i.e., learning the architecture of a neural network along with its weights.
no code implementations • 22 Jul 2015 • Suraj Srinivas, R. Venkatesh Babu
Our experiments in pruning the densely connected layers show that we can remove up to 85% of the total parameters in an MNIST-trained network, and about 35% for AlexNet, without significantly affecting performance.
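A minimal sketch of one such data-free step for a pair of fully connected PyTorch layers, folding a redundant neuron into its most similar neighbour; the similarity criterion and single-pair update are simplified for illustration and do not reproduce the paper's exact procedure:

    import torch

    def merge_most_similar_neurons(fc1, fc2):
        # fc1: layer whose output neurons are candidates for removal.
        # fc2: following layer, whose input weights absorb the removed neuron.
        W = fc1.weight.data               # [n_neurons, in_features]
        dists = torch.cdist(W, W)         # pairwise distances between neuron weight vectors
        dists.fill_diagonal_(float("inf"))
        i, j = divmod(int(dists.argmin()), dists.shape[1])
        # Fold neuron j's outgoing contribution into neuron i, then silence neuron j.
        fc2.weight.data[:, i] += fc2.weight.data[:, j]
        fc2.weight.data[:, j] = 0.0
        fc1.weight.data[j] = 0.0
        if fc1.bias is not None:
            fc1.bias.data[j] = 0.0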