Search Results for author: Shashwat Goel

Found 7 papers, 3 papers with code

Corrective Machine Unlearning

1 code implementation • 21 Feb 2024 • Shashwat Goel, Ameya Prabhu, Philip Torr, Ponnurangam Kumaraguru, Amartya Sanyal

We hope our work spurs research towards developing better methods for corrective unlearning and offers practitioners a new strategy to handle data integrity challenges arising from web-scale training.

Machine Unlearning

Representation Engineering: A Top-Down Approach to AI Transparency

1 code implementation • 2 Oct 2023 • Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski, Shashwat Goel, Nathaniel Li, Michael J. Byun, Zifan Wang, Alex Mallen, Steven Basart, Sanmi Koyejo, Dawn Song, Matt Fredrikson, J. Zico Kolter, Dan Hendrycks

In this paper, we identify and characterize the emerging area of representation engineering (RepE), an approach to enhancing the transparency of AI systems that draws on insights from cognitive neuroscience.

Question Answering

Proportional Aggregation of Preferences for Sequential Decision Making

no code implementations • 26 Jun 2023 • Nikhil Chandak, Shashwat Goel, Dominik Peters

In each round, a decision rule must choose a decision from a set of alternatives, where each voter reports which of these alternatives they approve.

Decision Making

Low impact agency: review and discussion

no code implementations • 6 Mar 2023 • Danilo Naiff, Shashwat Goel

Powerful artificial intelligence poses an existential threat if the AI decides to drastically change the world in pursuit of its goals.

Towards Adversarial Evaluations for Inexact Machine Unlearning

3 code implementations • 17 Jan 2022 • Shashwat Goel, Ameya Prabhu, Amartya Sanyal, Ser-Nam Lim, Philip Torr, Ponnurangam Kumaraguru

Machine Learning models face increased concerns regarding the storage of personal user data and adverse impacts of corrupted data like backdoors or systematic bias.

Machine Unlearning, Memorization

From Pivots to Graphs: Augmented CycleDensity as a Generalization to One Time InverseConsultation

no code implementations • 27 Aug 2021 • Shashwat Goel, Kunwar Shaanjeet Singh Grover

This paper describes an approach used to generate new translations using raw bilingual dictionaries as part of the 4th Task Inference Across Dictionaries (TIAD 2021) shared task.
