Interpretable Machine Learning

189 papers with code • 1 benchmark • 4 datasets

The goal of Interpretable Machine Learning is to allow oversight and understanding of machine-learned decisions. Much of the work in Interpretable Machine Learning has come in the form of devising methods to better explain the predictions of machine learning models.

Source: Assessing the Local Interpretability of Machine Learning Models

Libraries

Use these libraries to find Interpretable Machine Learning models and implementations
See all 10 libraries.

Most implemented papers

Interpretable Explanations of Black Boxes by Meaningful Perturbation

ruthcfong/perturb_explanations ICCV 2017

As machine learning algorithms are increasingly applied to high impact yet high risk tasks, such as medical diagnosis or autonomous driving, it is critical that researchers can explain how such algorithms arrived at their predictions.
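The core idea can be sketched in a few lines: optimize a soft mask so that replacing the masked region with a blurred reference drops the classifier's score for the predicted class, while a sparsity term keeps the deleted region small. The model, random "image", and hyperparameters below are illustrative stand-ins, not the paper's exact setup.

```python
# Minimal sketch of perturbation-based explanation (stand-in model/data).
import torch
import torch.nn.functional as F

model = torch.nn.Sequential(                    # stand-in classifier
    torch.nn.Conv2d(3, 8, 3, padding=1), torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(), torch.nn.Linear(8, 10),
)
model.eval()

x = torch.randn(1, 3, 64, 64)                   # stand-in image
baseline = F.avg_pool2d(x, kernel_size=9, stride=1, padding=4)  # blurred reference
cls = model(x).argmax(1).item()                 # class to explain

mask = torch.full((1, 1, 64, 64), 0.5, requires_grad=True)
opt = torch.optim.Adam([mask], lr=0.1)
for _ in range(100):
    m = mask.clamp(0, 1)
    perturbed = m * x + (1 - m) * baseline      # m = 0 deletes evidence
    score = model(perturbed).softmax(1)[0, cls]
    loss = score + 0.05 * (1 - m).mean()        # drop the score, delete little
    opt.zero_grad()
    loss.backward()
    opt.step()

saliency = 1 - mask.detach().clamp(0, 1)        # where deletion mattered most
```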

Interpretable machine learning: definitions, methods, and applications

csinva/imodels 14 Jan 2019

Official code for using and reproducing ACD (ICLR 2019) from the paper "Hierarchical interpretations for neural network predictions" (https://arxiv.org/abs/1806.05337).

Neural Additive Models: Interpretable Machine Learning with Neural Nets

lemeln/nam NeurIPS 2021

They perform similarly to existing state-of-the-art generalized additive models in accuracy, but are more flexible because they are based on neural nets instead of boosted trees.
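The additive structure is what makes the model interpretable: each feature passes through its own small network, and the prediction is the sum of the per-feature outputs, so each feature's contribution can be read off (and plotted) directly. Below is a minimal sketch of that structure; the layer sizes are illustrative, not the paper's exact architecture.

```python
# Minimal sketch of a neural additive model: one MLP per input feature,
# prediction = sum of per-feature contributions (illustrative sizes).
import torch

class NAM(torch.nn.Module):
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.feature_nets = torch.nn.ModuleList(
            torch.nn.Sequential(
                torch.nn.Linear(1, hidden), torch.nn.ReLU(),
                torch.nn.Linear(hidden, 1),
            ) for _ in range(n_features)
        )
        self.bias = torch.nn.Parameter(torch.zeros(1))

    def forward(self, x):                           # x: (batch, n_features)
        contribs = [net(x[:, i:i + 1]) for i, net in enumerate(self.feature_nets)]
        contribs = torch.cat(contribs, dim=1)       # one column per feature
        return contribs.sum(dim=1) + self.bias, contribs

model = NAM(n_features=5)
logits, contribs = model(torch.randn(8, 5))  # contribs[:, j] explains feature j
```

Plotting each column of `contribs` against its input feature recovers the learned shape function, the same kind of per-feature plot classical GAMs provide.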

Drop Clause: Enhancing Performance, Interpretability and Robustness of the Tsetlin Machine

anonymous-2491/drop-clause-interpretable-tm 30 May 2021

In this article, we introduce a novel variant of the Tsetlin machine (TM) that randomly drops clauses, the key learning elements of a TM.
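The drop-clause mechanism itself is analogous to dropout: during training, a random subset of clauses is excluded from the vote. The sketch below shows only that masking step; the clause evaluation is a random stand-in, since a real TM learns conjunctions of literals via Tsetlin automata.

```python
# Illustrative sketch of drop-clause only; clause outputs are stand-ins.
import numpy as np

rng = np.random.default_rng(0)
n_clauses, drop_p = 20, 0.25
signs = np.where(np.arange(n_clauses) % 2 == 0, 1, -1)  # +/- polarity clauses

def clause_outputs(x):
    # stand-in: a real TM evaluates each clause as a learned conjunction
    # of literals over the Boolean input x
    return rng.integers(0, 2, size=n_clauses)

def vote(x, training=False):
    out = clause_outputs(x)
    if training:                                         # drop-clause: ignore a
        out = out * (rng.random(n_clauses) >= drop_p)    # random subset this round
    return int(np.sign((signs * out).sum()))

x = np.ones(16, dtype=int)
print(vote(x, training=True), vote(x, training=False))
```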

ProtoAttend: Attention-Based Prototypical Learning

google-research/google-research 17 Feb 2019

We propose a novel inherently interpretable machine learning method that bases decisions on a few relevant examples that we call prototypes.
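Structurally, the decision mechanism amounts to attention over a database of candidate examples: the input's attention weights both produce the prediction and identify the supporting prototypes. A minimal sketch with a stand-in encoder and random data:

```python
# Minimal sketch of attention-based prototypical prediction (stand-ins).
import torch
import torch.nn.functional as F

d, n_classes = 16, 4
encoder = torch.nn.Linear(10, d)           # stand-in encoder

cand_x = torch.randn(32, 10)               # candidate (prototype) database
cand_y = torch.randint(0, n_classes, (32,))
x = torch.randn(1, 10)                     # input to classify and explain

q = encoder(x)                             # query
k = encoder(cand_x)                        # keys
attn = F.softmax(q @ k.t() / d ** 0.5, dim=1)        # weight per candidate
pred = attn @ F.one_hot(cand_y, n_classes).float()   # weighted label mixture

top = attn[0].topk(3)                      # the few prototypes that drove it
print(pred.argmax(1), top.indices, top.values)
```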

Disentangled Attribution Curves for Interpreting Random Forests and Boosted Trees

csinva/disentangled-attribution-curves 18 May 2019

Tree ensembles, such as random forests and AdaBoost, are ubiquitous machine learning models known for achieving strong predictive performance across a wide variety of domains.
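As a point of reference for what curve-style interpretation of a fitted forest looks like (this is standard partial dependence, not the paper's disentangled attribution curves), scikit-learn can trace how the ensemble's prediction moves as one feature is varied:

```python
# Related illustrative technique: a partial dependence curve for a forest.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import partial_dependence

X, y = make_classification(n_samples=300, n_features=6, random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

pd_result = partial_dependence(rf, X, features=[0], grid_resolution=20)
grid = pd_result.get("grid_values", pd_result.get("values"))[0]  # key name varies by sklearn version
for v, avg in zip(grid, pd_result["average"][0]):
    print(f"feature 0 = {v:+.2f} -> mean predicted prob {avg:.3f}")
```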

Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead

csinva/imodels 26 Nov 2018

Black box machine learning models are currently being used for high stakes decision-making throughout society, causing problems in healthcare, criminal justice, and other domains.
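The paper's prescription is to fit an inherently interpretable model rather than explain a black box after the fact. As a hedged sketch of that workflow, here a shallow decision tree stands in for the interpretable model (the accompanying csinva/imodels library provides rule lists and similar models); every prediction follows an explicit, auditable decision path:

```python
# Sketch: fit an interpretable model and read its rules directly.
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(data.data, data.target)

# the learned rules are the model; no post-hoc explanation needed
print(export_text(tree, feature_names=list(data.feature_names)))
```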

Explaining a black-box using Deep Variational Information Bottleneck Approach

SeojinBang/VIBI 19 Feb 2019

Brevity and comprehensiveness are both necessary to convey a large amount of information concisely when explaining a black-box decision system.
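The brief-but-comprehensive trade-off can be illustrated with a much simpler stand-in for the paper's learned information-bottleneck selector: keep only the k most influential inputs (scored here by gradient times input) and check that the model's decision survives.

```python
# Illustrative stand-in for a brief, comprehensive explanation (not VIBI).
import torch

model = torch.nn.Sequential(torch.nn.Linear(20, 16), torch.nn.ReLU(),
                            torch.nn.Linear(16, 3))   # stand-in model
x = torch.randn(1, 20, requires_grad=True)
cls = model(x).argmax(1).item()

model(x)[0, cls].backward()
scores = (x.grad * x.detach()).abs().squeeze(0)  # influence per input feature

k = 5
keep = torch.zeros(20)
keep[scores.topk(k).indices] = 1.0               # brief: only k features survive
same = model(x.detach() * keep).argmax(1).item() == cls
print(f"decision preserved with {k}/20 features: {same}")
```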

Improving performance of deep learning models with axiomatic attribution priors and expected gradients

suinleelab/attributionpriors ICLR 2020

Recent research has demonstrated that feature attribution methods for deep networks can themselves be incorporated into training; these attribution priors optimize for a model whose attributions have certain desirable properties -- most frequently, that particular features are important or unimportant.
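The expected-gradients attribution underlying the method averages interpolated gradients over baselines drawn from the data: attr(x) = E over x' from the data and alpha uniform in [0, 1] of (x - x') * grad f(x' + alpha * (x - x')). A minimal sketch of that attribution (the model and dataset are stand-ins, and the attribution-prior training loop is omitted):

```python
# Minimal sketch of expected-gradients attribution (stand-in model/data).
import torch

model = torch.nn.Sequential(torch.nn.Linear(10, 8), torch.nn.ReLU(),
                            torch.nn.Linear(8, 1))    # stand-in model
data = torch.randn(100, 10)                           # stand-in dataset
x = torch.randn(1, 10)

def expected_gradients(x, n_samples=200):
    attr = torch.zeros_like(x)
    for _ in range(n_samples):
        baseline = data[torch.randint(len(data), (1,))]   # x' ~ data
        alpha = torch.rand(1)                             # alpha ~ U(0, 1)
        point = (baseline + alpha * (x - baseline)).requires_grad_(True)
        grad = torch.autograd.grad(model(point).sum(), point)[0]
        attr += (x - baseline) * grad
    return attr / n_samples

print(expected_gradients(x))
```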

Explaining Groups of Points in Low-Dimensional Representations

GDPlumb/ELDR ICML 2020

A common workflow in data exploration is to learn a low-dimensional representation of the data, identify groups of points in that representation, and examine the differences between the groups to determine what they represent.
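That workflow is easy to make concrete. In the hedged sketch below, PCA stands in for the learned representation, k-means finds the groups, and a per-feature mean shift stands in for the group comparison (the paper's ELDR method instead finds sparse counterfactual-style differences between groups):

```python
# Sketch of the explore-then-explain workflow the paper studies.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
Z = PCA(n_components=2).fit_transform(X)          # low-dimensional view
groups = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Z)

for g in range(3):
    delta = X[groups == g].mean(0) - X.mean(0)    # what sets group g apart
    print(f"group {g}: per-feature shift {np.round(delta, 2)}")
```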