
Path Explain

A repository for explaining feature importances and feature interactions in deep neural networks using path attribution methods.

This repository contains tools to interpret and explain machine learning models using Integrated Gradients and Expected Gradients. In addition, it contains code to explain interactions in deep networks using Integrated Hessians and Expected Hessians, methods we introduced in our paper "Explaining Explanations: Axiomatic Feature Interactions for Deep Networks". If you use our work to explain your networks, please cite this paper:

@article{janizek2020explaining,
  author  = {Joseph D. Janizek and Pascal Sturmfels and Su-In Lee},
  title   = {Explaining Explanations: Axiomatic Feature Interactions for Deep Networks},
  journal = {Journal of Machine Learning Research},
  year    = {2021},
  volume  = {22},
  number  = {104},
  pages   = {1-54},
  url     = {http://jmlr.org/papers/v22/20-1223.html}
}

This repository contains two important directories: the path_explain directory, which contains the package used to interpret and explain machine learning models, and the examples directory, which contains many examples of using the path_explain module to explain different models on different data types. Roughly, the top level looks like the sketch below.
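A sketch of the layout, based on the description above (exact contents may vary):

path_explain/          # the core package: explainers and plotting utilities
examples/              # example notebooks for different models and data types
example_usage.ipynb    # quick-start notebook (see the Examples section below)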

Installation

The easiest way to install this package is by using pip:

pip install path-explain

Alternatively, you can clone this repository to re-run and explore the examples provided.

Compatibility

This package was written to support TensorFlow 2.0 (in eager execution mode) with Python 3. We have no current plans to support earlier versions of TensorFlow or Python.
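A minimal sanity check before running the examples (a sketch, not part of the package itself):

import sys
import tensorflow as tf

# path_explain expects TensorFlow 2.x running eagerly under Python 3
print(sys.version_info >= (3,))   # should be True
print(tf.__version__)             # should start with '2.'
print(tf.executing_eagerly())     # True by default in TF 2.x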

API

Although we don't yet have formal API documentation, the underlying code does a good job of explaining the API. See the code for generating attributions and interactions to better understand what the arguments to these functions mean.
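As a quick orientation, here is a commented sketch of the core call, with argument meanings inferred from the examples below (not formal documentation; it assumes an explainer constructed as in those examples):

# Sketch: argument meanings inferred from the usage examples below
attributions = explainer.attributions(
    inputs=x_test,           # samples to explain
    baseline=x_train,        # reference input(s); a distribution of samples when use_expectation=True
    batch_size=100,          # how many gradient evaluations to batch at once
    num_samples=200,         # interpolation / expectation samples per input
    use_expectation=True,    # True: Expected Gradients; False: Integrated Gradients
    output_indices=0)        # which model output to explain
# explainer.interactions(...) takes the same arguments and returns a
# pairwise (feature x feature) interaction matrix per input.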

Examples

For a simple, quick example to get started, see the example_usage.ipynb notebook in the top-level directory of this repository. It gives an overview of the functionality the package provides. For more advanced examples, read on.

Tabular Data using Expected Gradients and Expected Hessians

Our repository can easily be adapted to explain attributions and interactions learned on tabular data.

# other import statements...
from path_explain import PathExplainerTF, scatter_plot, summary_plot

### Code to train a model would go here
x_train, y_train, x_test, y_test = dataset()
model = ...
model.fit(x_train, y_train, ...)
###

### Generating attributions using expected gradients
explainer = PathExplainerTF(model)
attributions = explainer.attributions(inputs=x_test,
                                      baseline=x_train,
                                      batch_size=100,
                                      num_samples=200,
                                      use_expectation=True,
                                      output_indices=0)
###

### Generating interactions using expected hessians
interactions = explainer.interactions(inputs=x_test,
                                      baseline=x_train,
                                      batch_size=100,
                                      num_samples=200,
                                      use_expectation=True,
                                      output_indices=0)
###

Once we've generated attributions and interactions, we can use the provided plotting modules to help visualize them. First we plot a summary of the top features and their attribution values:

### First we need a list of strings denoting the name of each feature
feature_names = ...
###

summary_plot(attributions=attributions,
             feature_values=x_test,
             feature_names=feature_names,
             plot_top_k=10)

Heart Disease Summary Plot

Second, we plot an interaction our model has learned between maximum achieved heart rate and gender:

scatter_plot(attributions=attributions,
             feature_values=x_test,
             feature_index='max. achieved heart rate',
             interactions=interactions,
             color_by='is male',
             feature_names=feature_names,
             scale_y_ind=True)

Interaction: Heart Rate and Gender

The model used to generate the above interactions is a two-layer neural network trained on the UCI Heart Disease Dataset. Interactions learned by this model were featured in our paper. To learn more about this particular model and the experimental setup, see the notebook used to train and explain the model.
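A model of this form takes only a few lines of Keras to define. The sketch below is illustrative: the layer widths are assumptions, not the exact configuration from the paper, and it reuses x_train from the snippet above.

import tensorflow as tf

# Illustrative two-layer network for tabular features;
# layer sizes are assumptions, not the paper's exact setup.
num_features = x_train.shape[1]
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(num_features,)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy')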

Explaining an NLP model using Integrated Gradients and Integrated Hessians

As discussed in our paper, we can use Integrated Hessians to get interactions in language models. We explain a transformer from the HuggingFace Transformers Repository.

import numpy as np
import tensorflow as tf
import tensorflow_datasets

from transformers import DistilBertTokenizer, TFDistilBertForSequenceClassification, \
                         DistilBertConfig, glue_convert_examples_to_features, \
                         glue_processors

# This is a custom explainer to explain huggingface models
from path_explain import EmbeddingExplainerTF, text_plot, matrix_interaction_plot, bar_interaction_plot

tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
num_labels = 2  # SST-2 is a binary sentiment classification task
config = DistilBertConfig.from_pretrained('distilbert-base-uncased', num_labels=num_labels)
model = TFDistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased', config=config)

### Some custom code to fine-tune the model on a sentiment analysis task...
max_length = 128
data, info = tensorflow_datasets.load('glue/sst-2', with_info=True)
train_dataset = glue_convert_examples_to_features(data['train'],
                                                  tokenizer,
                                                  max_length,
                                                  'sst-2')
valid_dataset = glue_convert_examples_to_features(data['validation'],
                                                  tokenizer,
                                                  max_length,
                                                  'sst-2')
...
### we won't include the whole fine-tuning code. See the HuggingFace repository for more.

### Here we define functions that represent two pieces of the model:
### embedding and prediction
def embedding_model(batch_ids):
    batch_embedding = model.distilbert.embeddings(batch_ids)
    return batch_embedding

def prediction_model(batch_embedding):
    # Note: this isn't exactly the right way to use the attention mask.
    # It should actually indicate which words are real words. This
    # makes the coding easier however, and the output is fairly similar,
    # so it suffices for this tutorial.
    attention_mask = tf.ones(batch_embedding.shape[:2])
    attention_mask = tf.cast(attention_mask, dtype=tf.float32)
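    # A more faithful mask would flag only real (non-padding) tokens, e.g.
    # computed from the token ids before embedding (a sketch, assuming the
    # tokenizer's pad token is used for padding):
    #   attention_mask = tf.cast(batch_ids != tokenizer.pad_token_id, tf.float32)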
    head_mask = [None] * model.distilbert.num_hidden_layers

    transformer_output = model.distilbert.transformer([batch_embedding, attention_mask, head_mask], training=False)[0]
    pooled_output = transformer_output[:, 0]
    pooled_output = model.pre_classifier(pooled_output)
    logits = model.classifier(pooled_output)
    return logits
###

### We need some data to explain
for batch in valid_dataset.take(1):
    batch_input = batch[0]

batch_ids = batch_input['input_ids']
batch_embedding = embedding_model(batch_ids)

baseline_ids = np.zeros((1, 128), dtype=np.int64)
baseline_embedding = embedding_model(baseline_ids)
###

### We are finally ready to explain our model
explainer = EmbeddingExplainerTF(prediction_model)
attributions = explainer.attributions(inputs=batch_embedding,
                                      baseline=baseline_embedding,
                                      batch_size=32,
                                      num_samples=256,
                                      use_expectation=False,
                                      output_indices=1)
###

### For interactions, the hessian is rather large so we use a very small batch size
interactions = explainer.interactions(inputs=batch_embedding,
                                      baseline=baseline_embedding,
                                      batch_size=1,
                                      num_samples=256,
                                      use_expectation=False,
                                      output_indices=1)
###

We can plot the learned attributions and interactions as follows. First we plot the attributions:

### First we need to decode the tokens from the batch ids.
batch_sentences = ...
### Doing so will depend on how you tokenized your model!
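For the DistilBERT tokenizer used above, one plausible sketch is the following (adjust to your own tokenization; this simply maps each id back to its token string):

batch_sentences = [tokenizer.convert_ids_to_tokens(ids)
                   for ids in batch_ids.numpy().tolist()]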

text_plot(batch_sentences[0],
          attributions[0],
          include_legend=True)

Showing feature attributions in text

Then we plot the interactions:

bar_interaction_plot(interactions[0],
                     batch_sentences[0],
                     top_k=5)

Showing feature interactions in text

If you would rather see the full matrix of interactions than a bar plot of the top interactions, our package also supports this. First we show the attributions:

text_plot(batch_sentences[1],
          attributions[1],
          include_legend=True)

Showing additional attributions

And then we show the full interaction matrix. Here we've zeroed out the diagonals so you can better see the off-diagonal terms.

matrix_interaction_plot(interactions[1],
                        batch_sentences[1])

Showing the full matrix of feature interactions

This example - interpreting DistilBERT - was also featured in our paper. You can examine the full setup in the corresponding example notebook. For more examples, see the examples directory in this repository.
