Search Results for author: Suraj Srinivas

Found 22 papers, 9 papers with code

Interpreting CLIP with Sparse Linear Concept Embeddings (SpLiCE)

1 code implementation • 16 Feb 2024 • Usha Bhalla, Alex Oesterling, Suraj Srinivas, Flavio P. Calmon, Himabindu Lakkaraju

CLIP embeddings have demonstrated remarkable performance across a wide range of computer vision tasks.

Model Editing

Certifying LLM Safety against Adversarial Prompting

1 code implementation • 6 Sep 2023 • Aounon Kumar, Chirag Agarwal, Suraj Srinivas, Aaron Jiaxun Li, Soheil Feizi, Himabindu Lakkaraju

We defend against three attack modes: i) adversarial suffix, where an adversarial sequence is appended at the end of a harmful prompt; ii) adversarial insertion, where the adversarial sequence is inserted anywhere in the middle of the prompt; and iii) adversarial infusion, where adversarial tokens are inserted at arbitrary positions in the prompt, not necessarily as a contiguous block.

Adversarial Attack • Language Modelling
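A minimal sketch of the three threat models on token sequences (the function names and token-list representation are illustrative, not from the paper):

    import random

    def suffix_attack(prompt_tokens, adv_tokens):
        # i) adversarial suffix: the adversarial sequence is appended at the end
        return prompt_tokens + adv_tokens

    def insertion_attack(prompt_tokens, adv_tokens, position):
        # ii) adversarial insertion: a contiguous adversarial block is placed
        # anywhere in the middle of the prompt
        return prompt_tokens[:position] + adv_tokens + prompt_tokens[position:]

    def infusion_attack(prompt_tokens, adv_tokens, rng=random):
        # iii) adversarial infusion: adversarial tokens are scattered at
        # arbitrary, not necessarily contiguous, positions
        result = list(prompt_tokens)
        for tok in adv_tokens:
            result.insert(rng.randrange(len(result) + 1), tok)
        return result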

Discriminative Feature Attributions: Bridging Post Hoc Explainability and Inherent Interpretability

1 code implementation • NeurIPS 2023 • Usha Bhalla, Suraj Srinivas, Himabindu Lakkaraju

This strategy naturally combines the ease of use of post hoc explanations with the faithfulness of inherently interpretable models.

Attribute

Efficient Estimation of Average-Case Robustness for Multi-Class Classification

no code implementations • 26 Jul 2023 • Tessa Han, Suraj Srinivas, Himabindu Lakkaraju

These estimators linearize models in the local region around an input and analytically compute the robustness of the resulting linear models.

Multi-class Classification
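As a rough illustration of the linearization idea, the sketch below estimates average-case robustness under Gaussian input noise with a per-class union bound; the paper's estimators may combine classes differently:

    import torch
    from torch.distributions import Normal

    def linearized_robustness(model, x, label, sigma):
        # Linearize the logits around x; under eps ~ N(0, sigma^2 I), each
        # pairwise margin of the linear model is Gaussian, so the flip
        # probability per competing class has a closed form.
        x = x.detach().requires_grad_(True)
        logits = model(x)[0]
        grads = [torch.autograd.grad(logits[c], x, retain_graph=True)[0].flatten()
                 for c in range(logits.numel())]
        phi = Normal(0.0, 1.0).cdf
        p_flip = 0.0
        for c in range(logits.numel()):
            if c == label:
                continue
            margin = (logits[label] - logits[c]).item()
            direction = grads[label] - grads[c]
            # Class c overtakes `label` with prob Phi(-margin / (sigma * ||g||))
            p_flip += phi(torch.tensor(-margin / (sigma * direction.norm().item() + 1e-12))).item()
        return max(0.0, 1.0 - p_flip)  # Bonferroni (union-bound) lower bound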

On Minimizing the Impact of Dataset Shifts on Actionable Explanations

no code implementations • 11 Jun 2023 • Anna P. Meyer, Dan Ley, Suraj Srinivas, Himabindu Lakkaraju

To this end, we conduct a rigorous theoretical analysis to demonstrate that model curvature, the weight decay used during training, and the magnitude of the dataset shift are key factors determining the extent of explanation (in)stability.

Consistent Explanations in the Face of Model Indeterminacy via Ensembling

no code implementations • 9 Jun 2023 • Dan Ley, Leonard Tang, Matthew Nazari, Hongjin Lin, Suraj Srinivas, Himabindu Lakkaraju

This work addresses the challenge of providing consistent explanations for predictive models in the presence of model indeterminacy, which arises due to the existence of multiple (nearly) equally well-performing models for a given dataset and task.

Word-Level Explanations for Analyzing Bias in Text-to-Image Models

no code implementations • 3 Jun 2023 • Alexander Lin, Lucas Monteiro Paes, Sree Harsha Tanneru, Suraj Srinivas, Himabindu Lakkaraju

We introduce a method for computing a score for each word in the prompt; each score represents that word's influence on biases in the model's output.

Sentence
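One simple way to realize such per-word scores is leave-one-out ablation, sketched below; `generate` and `bias_metric` are hypothetical stand-ins for a text-to-image model and a bias measurement, and the paper's estimator may differ:

    def word_influence_scores(prompt, generate, bias_metric, num_images=16):
        # Score each word by how much removing it changes the bias metric
        # computed over a batch of generated images.
        words = prompt.split()
        baseline = bias_metric(generate(prompt, num_images))
        scores = {}
        for i, word in enumerate(words):
            ablated = " ".join(words[:i] + words[i + 1:])
            scores[word] = baseline - bias_metric(generate(ablated, num_images))
        return scores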

Which Models have Perceptually-Aligned Gradients? An Explanation via Off-Manifold Robustness

1 code implementation • NeurIPS 2023 • Suraj Srinivas, Sebastian Bordt, Hima Lakkaraju

Quantifying the perceptual alignment of model gradients via their similarity with the gradients of generative models, we show that off-manifold robustness correlates well with perceptual alignment.

Denoising • Image Generation
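A hedged sketch of one way to quantify this similarity, comparing a classifier's input-gradient with a score direction recovered from a (hypothetical) Gaussian denoiser via Tweedie's formula; the paper's exact metric may differ:

    import torch
    import torch.nn.functional as F

    def perceptual_alignment(classifier, denoiser, x, label, sigma=0.1):
        # Input-gradient of the true-class logit
        x = x.detach().requires_grad_(True)
        grad = torch.autograd.grad(classifier(x)[0, label], x)[0].flatten()
        with torch.no_grad():
            noisy = x + sigma * torch.randn_like(x)
            # Tweedie's formula: score of the smoothed density ~ (denoised - noisy) / sigma^2
            score = ((denoiser(noisy) - noisy) / sigma**2).flatten()
        return F.cosine_similarity(grad, score, dim=0).item()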

Efficiently Training Low-Curvature Neural Networks

2 code implementations • 14 Jun 2022 • Suraj Srinivas, Kyle Matoba, Himabindu Lakkaraju, Francois Fleuret

To achieve this, we minimize a data-independent upper bound on the curvature of a neural network, which decomposes overall curvature in terms of curvatures and slopes of its constituent layers.

Adversarial Robustness
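The sketch below shows a generic finite-difference curvature penalty in this spirit; it is not the paper's layerwise upper bound, which constrains the curvatures and slopes of individual layers:

    import torch

    def curvature_penalty(model, x, h=1e-2):
        # How much does the input-gradient change under a small random
        # perturbation? (Summing the logits is a simplification.)
        def input_grad(inp):
            return torch.autograd.grad(model(inp).sum(), inp, create_graph=True)[0]
        x1 = x.detach().requires_grad_(True)
        v = torch.randn_like(x)
        x2 = (x + h * v / v.flatten().norm()).detach().requires_grad_(True)
        # Differentiable w.r.t. model parameters, so it can be added to the loss
        return (input_grad(x2) - input_grad(x1)).flatten().norm() / h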

Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post Hoc Explanations

1 code implementation • 2 Jun 2022 • Tessa Han, Suraj Srinivas, Himabindu Lakkaraju

By bringing diverse explanation methods into a common framework, this work (1) advances the conceptual understanding of these methods, revealing their shared local function approximation objective, properties, and relation to one another, and (2) guides the use of these methods in practice, providing a principled approach to choose among methods and paving the way for the creation of new ones.

Data-Efficient Structured Pruning via Submodular Optimization

1 code implementation • 9 Mar 2022 • Marwa El Halabi, Suraj Srinivas, Simon Lacoste-Julien

Structured pruning is an effective approach for compressing large pre-trained neural networks without significantly affecting their performance.

Cyclical Pruning for Sparse Neural Networks

no code implementations • 2 Feb 2022 • Suraj Srinivas, Andrey Kuzmin, Markus Nagel, Mart van Baalen, Andrii Skliar, Tijmen Blankevoort

Current methods for pruning neural network weights iteratively apply magnitude-based pruning on the model weights and re-train the resulting model to recover lost accuracy.
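The baseline prune/re-train loop described here can be sketched as follows (the `train_epochs` callback and the per-cycle sparsity ramp are illustrative assumptions, not the paper's schedule):

    import torch

    def magnitude_prune_(weights, sparsity):
        # Zero out the smallest-magnitude fraction `sparsity` of `weights` in place
        k = int(sparsity * weights.numel())
        if k == 0:
            return
        threshold = weights.abs().flatten().kthvalue(k).values
        weights.mul_(weights.abs() > threshold)

    def iterative_magnitude_pruning(model, train_epochs, cycles, final_sparsity):
        for cycle in range(1, cycles + 1):
            sparsity = final_sparsity * cycle / cycles  # illustrative ramp
            for p in model.parameters():
                if p.dim() > 1:              # prune weight matrices only
                    magnitude_prune_(p.data, sparsity)
            train_epochs(model)              # re-train to recover lost accuracy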

Rethinking the Role of Gradient-Based Attribution Methods for Model Interpretability

1 code implementation • ICLR 2021 • Suraj Srinivas, Francois Fleuret

This leads us to hypothesize that the highly structured and explanatory nature of input-gradients may be due to the alignment of this class-conditional model $p_{\theta}(x \mid y)$ with that of the ground truth data distribution $p_{\text{data}}(x \mid y)$.

Open-Ended Question Answering
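For context, such a class-conditional model can be read off the logits via the standard energy-based construction (a sketch; the indexing notation $f_\theta(x)[y]$ for the $y$-th logit is ours):

    p_\theta(x \mid y) = \frac{\exp\big(f_\theta(x)[y]\big)}{Z_\theta(y)},
    \qquad Z_\theta(y) = \int \exp\big(f_\theta(x)[y]\big)\, dx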

Full-Gradient Representation for Neural Network Visualization

2 code implementations • NeurIPS 2019 • Suraj Srinivas, Francois Fleuret

Our experiments reveal that our method explains model behaviour correctly, and more comprehensively than other methods in the literature.

Interpretable Machine Learning

Knowledge Transfer with Jacobian Matching

no code implementations • ICML 2018 • Suraj Srinivas, Francois Fleuret

We then rely on this analysis to apply Jacobian matching to transfer learning by establishing equivalence of a recent transfer learning procedure to distillation.

Transfer Learning
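A minimal sketch of Jacobian matching combined with distillation, assuming a frozen teacher; the loss weighting and gradient normalization are illustrative choices:

    import torch
    import torch.nn.functional as F

    def jacobian_matching_loss(student, teacher, x, y):
        x = x.detach().requires_grad_(True)
        s_logits = student(x)
        t_logits = teacher(x)
        s_true = s_logits.gather(1, y[:, None]).sum()
        t_true = t_logits.gather(1, y[:, None]).sum()
        # Input-Jacobians of the true-class logits (create_graph so the
        # student's Jacobian term is itself trainable)
        s_grad = torch.autograd.grad(s_true, x, create_graph=True)[0].flatten(1)
        t_grad = torch.autograd.grad(t_true, x)[0].flatten(1)
        distill = F.kl_div(F.log_softmax(s_logits, dim=-1),
                           F.softmax(t_logits.detach(), dim=-1),
                           reduction="batchmean")
        jacobian = (F.normalize(s_grad, dim=1)
                    - F.normalize(t_grad, dim=1)).pow(2).sum(dim=1).mean()
        return distill + jacobian  # equal weighting is an assumption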

Confidence estimation in Deep Neural networks via density modelling

no code implementations • 21 Jul 2017 • Akshayvarun Subramanya, Suraj Srinivas, R. Venkatesh Babu

State-of-the-art Deep Neural Networks can be easily fooled into providing incorrect high-confidence predictions for images with small amounts of adversarial noise.

Training Sparse Neural Networks

no code implementations • 21 Nov 2016 • Suraj Srinivas, Akshayvarun Subramanya, R. Venkatesh Babu

Deep neural networks with lots of parameters are typically used for large-scale computer vision tasks such as image classification.

General Classification • Image Classification

Generalized Dropout

no code implementations • 21 Nov 2016 • Suraj Srinivas, R. Venkatesh Babu

One set of methods in this family, called Dropout++, is a version of Dropout with trainable parameters.

Bayesian Inference
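A relaxed-Bernoulli sketch of a dropout layer with trainable retain-probabilities, in the spirit of Dropout++; the paper's exact parameterization is not reproduced here:

    import torch
    import torch.nn as nn

    class LearnableDropout(nn.Module):
        def __init__(self, num_features, init_retain=0.8, temp=0.1):
            super().__init__()
            init = torch.logit(torch.tensor(init_retain))
            self.theta = nn.Parameter(init * torch.ones(num_features))
            self.temp = temp

        def forward(self, x):
            if not self.training:
                return x * torch.sigmoid(self.theta)   # expected gate at test time
            u = torch.rand_like(x)
            # Concrete / Gumbel-sigmoid relaxation of Bernoulli gates, so the
            # retain-probabilities receive gradients during training
            gate = torch.sigmoid((torch.log(u) - torch.log1p(-u) + self.theta) / self.temp)
            return x * gate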

Compensating for Large In-Plane Rotations in Natural Images

no code implementations • 17 Nov 2016 • Lokesh Boominathan, Suraj Srinivas, R. Venkatesh Babu

This is inspired by the neuroscientific concept of mental rotation, which humans use to compare pairs of rotated objects.

Bayesian Optimization • Image Retrieval +1

A Taxonomy of Deep Convolutional Neural Nets for Computer Vision

no code implementations • 25 Jan 2016 • Suraj Srinivas, Ravi Kiran Sarvadevabhatla, Konda Reddy Mopuri, Nikita Prabhu, Srinivas S. S. Kruthiventi, R. Venkatesh Babu

With this new paradigm, every problem in computer vision is now being re-examined from a deep learning perspective.

Learning Neural Network Architectures using Backpropagation

no code implementations • 17 Nov 2015 • Suraj Srinivas, R. Venkatesh Babu

In this work, we introduce the problem of architecture-learning, i.e., learning the architecture of a neural network along with its weights.
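One way to make architecture (here, layer width) trainable by backpropagation is to attach gates to units and penalize them toward zero, as in this hedged sketch (the gating and regularizer are illustrative, not the paper's exact formulation):

    import torch
    import torch.nn as nn

    class GatedLinear(nn.Module):
        def __init__(self, d_in, d_out):
            super().__init__()
            self.linear = nn.Linear(d_in, d_out)
            self.gate_logits = nn.Parameter(torch.zeros(d_out))

        def forward(self, x):
            # Each output unit is scaled by a trainable gate in [0, 1]
            return self.linear(x) * torch.sigmoid(self.gate_logits)

        def width_penalty(self):
            # Added to the training loss; units whose gates reach ~0 can be
            # removed, so backprop effectively selects the layer width
            return torch.sigmoid(self.gate_logits).sum()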

Data-free parameter pruning for Deep Neural Networks

no code implementations • 22 Jul 2015 • Suraj Srinivas, R. Venkatesh Babu

Our experiments in pruning the densely connected layers show that we can remove up to 85% of the total parameters in an MNIST-trained network, and about 35% for AlexNet, without significantly affecting performance.
