A practical guide towards explainability and bias evaluation in machine learning
This repo contains the full Jupyter notebook and code for the Python talk on machine learning explainability and algorithmic bias.
YouTube Video of Talk
This video of the talk, presented at PyData London 2019, provides an overview of the motivations for machine learning explainability, as well as techniques to introduce explainability and mitigate undesired biases.
Live Slides (Reveal.JS)
The presentation was delivered using the RISE plugin, which converts the Jupyter notebook into a reveal.js presentation. The reveal.js presentation is hosted live in this repo under the index.html page.
Examples to try it yourself
Code examples to try it yourself:
- Data analysis for data imbalances with XAI
- Black box model evaluation for MNIST with Alibi
- Production monitoring with Seldon and Alibi
Open Source Tools used
This example uses the following open source libraries:
- XAI - We use XAI to showcase data analysis techniques
- Alibi - We use Alibi to dive into black box model evaluation techniques
- Seldon Core - We use Seldon Core to deploy and serve ML models and ML explainers
Summarised version in markdown format
In the section below you can find the summarised version of the Jupyter notebook / presentation slides in Markdown format.
Contents
This section contains the code blocks that summarise the 3 steps proposed in the presentation for explainability: 1) data analysis, 2) model evaluation and 3) production monitoring.
1) Data Analysis
Points to cover
1.1) Data imbalances
1.2) Upsampling / downsampling
1.3) Correlations
1.4) Train / test set
1.5) Further techniques
XAI - eXplainable AI
We'll be using the XAI library, which provides a set of tools to analyse and explain machine learning data.
https://github.com/EthicalML/XAI
Let's get the new training dataset
X, y, X_train, X_valid, y_train, y_valid, X_display, y_display, df, df_display \
= get_dataset_2()
df_display.head()
|   | age | workclass | education | education-num | marital-status | occupation | relationship | ethnicity | gender | capital-gain | capital-loss | hours-per-week | native-country | loan |
|---|-----|-----------|-----------|---------------|----------------|------------|--------------|-----------|--------|--------------|--------------|----------------|----------------|------|
| 0 | 39 | State-gov | Bachelors | 13 | Never-married | Adm-clerical | Not-in-family | White | Male | 2174 | 0 | 40 | United-States | False |
| 1 | 50 | Self-emp-not-inc | Bachelors | 13 | Married-civ-spouse | Exec-managerial | Husband | White | Male | 0 | 0 | 13 | United-States | False |
| 2 | 38 | Private | HS-grad | 9 | Divorced | Handlers-cleaners | Not-in-family | White | Male | 0 | 0 | 40 | United-States | False |
| 3 | 53 | Private | 11th | 7 | Married-civ-spouse | Handlers-cleaners | Husband | Black | Male | 0 | 0 | 40 | United-States | False |
| 4 | 28 | Private | Bachelors | 13 | Married-civ-spouse | Prof-specialty | Wife | Black | Female | 0 | 0 | 40 | Cuba | False |
1.1) Data imbalances
We can visualise the imbalances by looking at the number of examples for each class
im = xai.imbalance_plot(df_display, "gender", threshold=0.55, categorical_cols=["gender"])
We can also evaluate imbalances across the cross-product of multiple categorical columns
im = xai.imbalance_plot(df_display, "gender", "loan" , categorical_cols=["loan", "gender"])
For numeric columns we can break the values down into bins
im = xai.imbalance_plot(df_display, "age" , bins=10)
1.2) Upsampling / Downsampling
im = xai.balance(df_display, "ethnicity", "loan", categorical_cols=["ethnicity", "loan"],
upsample=0.5, downsample=0.5, bins=5)
1.3) Correlations hidden in data
We can identify potential correlations across variables through a dendrogram visualisation
corr = xai.correlations(df_display, include_categorical=True)
1.4) Balanced train/testing sets
X_train_balanced, y_train_balanced, X_valid_balanced, y_valid_balanced, train_idx, test_idx = \
xai.balanced_train_test_split(
X, y, "gender",
min_per_group=300,
max_per_group=300,
categorical_cols=["gender", "loan"])
X_valid_balanced["loan"] = y_valid_balanced
im = xai.imbalance_plot(X_valid_balanced, "gender", "loan", categorical_cols=["gender", "loan"])
1.5) Shoutout to other tools and techniques
https://github.com/EthicalML/awesome-production-machine-learning#industrial-strength-visualisation-libraries
2) Model evaluation
Points to cover
2.1) Standard model evaluation metrics
2.2) Global model explanation techniques
2.3) Black box local model explanation techniques
2.4) Other libraries available
Alibi - Black Box Model Explanations
A set of proven scientific techniques to explain ML models as black boxes
https://github.com/SeldonIO/Alibi
Model Evaluation Metrics: White / Black Box
Model Evaluation Metrics: Global vs Local
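As a rough illustration of these two distinctions (this sketch is not part of the original notebook and assumes a hypothetical fitted scikit-learn classifier `rf` plus the `X_valid` / `y_valid` split from above), a white-box evaluation can read a model's internals directly, whereas a black-box evaluation only calls the model through `predict`; likewise a global view scores behaviour over the whole dataset, while a local view explains a single prediction:

```python
# Hedged sketch, not part of the original notebook.
# `rf` is a hypothetical fitted sklearn RandomForestClassifier;
# X_valid / y_valid are the validation split from the earlier cells.
from sklearn.inspection import permutation_importance

# White box + global: read the model internals directly.
print(rf.feature_importances_)  # impurity-based importances

# Black box + global: only call the model, e.g. permutation importance.
result = permutation_importance(rf, X_valid, y_valid, n_repeats=5, random_state=0)
print(result.importances_mean)

# Black box + local: explain a single prediction (e.g. with Alibi anchors, covered in 2.3).
```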
2.1) Standard model evaluation metrics
# Let's start by building our model with our newly balanced dataset
model = build_model(X)
model.fit(f_in(X_train), y_train,
          epochs=20,
          batch_size=512,
          shuffle=True,
          validation_data=(f_in(X_valid), y_valid),
          callbacks=[PlotLossesKeras()],
          verbose=0,
          validation_split=0.05)
probabilities = model.predict(f_in(X_valid))
pred = f_out(probabilities)
Log-loss (cost function):
training (min: 0.311, max: 0.581, cur: 0.311)
validation (min: 0.312, max: 0.464, cur: 0.312)
Accuracy:
training (min: 0.724, max: 0.856, cur: 0.856)
validation (min: 0.808, max: 0.857, cur: 0.857)
xai.confusion_matrix_plot(y_valid, pred)
im = xai.roc_plot(y_valid, pred)
im = xai.roc_plot(y_valid, pred, df=X_valid, cross_cols=["gender"], categorical_cols=["gender"])
im = xai.metrics_plot(y_valid, pred)
im = xai.metrics_plot(y_valid, pred, df=X_valid, cross_cols=["gender"], categorical_cols="gender")
2.2) Global black box model evaluation metrics
imp = xai.feature_importance(X_valid, y_valid, lambda x, y: model.evaluate(f_in(x), y, verbose=0)[1], repeat=1)
2.3) Local black box model evaluation metrics
Overview of methods
Anchors
Anchors are if-then rules that sufficiently "anchor" the prediction locally (changes to the remaining feature values do not change the prediction), while trying to maximise the coverage for which the explanation holds. (ArXiv: Anchors: High-Precision Model-Agnostic Explanations)
from alibi.explainers import AnchorTabular
explainer = AnchorTabular(
loan_model_alibi.predict,
feature_names_alibi,
categorical_names=category_map_alibi)
explainer.fit(
X_train_alibi,
disc_perc=[25, 50, 75])
print("Explainer built")
Explainer built
X_test_alibi[:1]
array([[52, 4, 0, 2, 8, 4, 2, 0, 0, 0, 60, 9]])
explanation = explainer.explain(X_test_alibi[:1], threshold=0.95)
print('Anchor: %s' % (' AND '.join(explanation['names'])))
print('Precision: %.2f' % explanation['precision'])
print('Coverage: %.2f' % explanation['coverage'])
Anchor: Marital Status = Separated AND Sex = Female AND Capital Gain <= 0.00
Precision: 0.97
Coverage: 0.10
Counterfactual Explanations
The counterfactual explanation of an outcome or a situation Y takes the form "If X had not occurred, Y would not have occurred".
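Alibi also provides a counterfactual explainer. A minimal sketch, not from the original notebook, assuming the Keras `model`, the `f_in` pre-processing helper and `X_valid` from the cells above, and the `CounterFactual` explainer of the Alibi version used at the time, could look like this:

```python
import numpy as np
from alibi.explainers import CounterFactual

# Black-box probability function over the raw features, reusing f_in from above
predict_fn = lambda x: model.predict(f_in(x))

cf = CounterFactual(
    predict_fn,
    shape=(1, np.asarray(X_valid).shape[1]),  # shape of a single instance
    target_class='other',                     # search for any class flip
    max_iter=1000)

explanation = cf.explain(np.asarray(X_valid)[:1])
# Dict-style access as in older Alibi releases; newer versions return an Explanation object
print(explanation['cf'])
```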
2.4) Shoutout to other tools and techniques
https://github.com/EthicalML/awesome-production-machine-learning#explaining-black-box-models-and-datasets
3) Production Monitoring
Key points to cover
- Design patterns for explainers
- Live demo of explainers
- Leveraging humans for explainers
Seldon Core - Production ML in K8s
A language agnostic ML serving & monitoring framework in Kubernetes
https://github.com/SeldonIO/seldon-core
3.1) Design patterns for explainers
Setup Seldon in your kubernetes cluster
%%bash
kubectl create clusterrolebinding kube-system-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:default
helm init
kubectl rollout status deploy/tiller-deploy -n kube-system
helm install seldon-core-operator --name seldon-core-operator --repo https://storage.googleapis.com/seldon-charts
helm install seldon-core-analytics --name seldon-core-analytics --repo https://storage.googleapis.com/seldon-charts
helm install stable/ambassador --name ambassador
from sklearn.preprocessing import LabelEncoder, StandardScaler, OneHotEncoder
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
# feature transformation pipeline
ordinal_features = [x for x in range(len(alibi_feature_names)) if x not in list(alibi_category_map.keys())]
ordinal_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='median')),
('scaler', StandardScaler())])
categorical_features = list(alibi_category_map.keys())
categorical_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='median')),
('onehot', OneHotEncoder(handle_unknown='ignore'))])
preprocessor = ColumnTransformer(transformers=[('num', ordinal_transformer, ordinal_features),
('cat', categorical_transformer, categorical_features)])
preprocessor.fit(alibi_data)
from sklearn.ensemble import RandomForestClassifier
np.random.seed(0)
clf = RandomForestClassifier(n_estimators=50)
clf.fit(preprocessor.transform(X_train_alibi), y_train_alibi)
!mkdir -p pipeline/pipeline_steps/loanclassifier/
Save the model artefacts so we can deploy them
import dill
with open("pipeline/pipeline_steps/loanclassifier/preprocessor.dill", "wb") as prep_f:
dill.dump(preprocessor, prep_f)
with open("pipeline/pipeline_steps/loanclassifier/model.dill", "wb") as model_f:
dill.dump(clf, model_f)
Build a Model wrapper that uses the trained models through a predict function
%%writefile pipeline/pipeline_steps/loanclassifier/Model.py
import dill
class Model:
def __init__(self, *args, **kwargs):
with open("preprocessor.dill", "rb") as prep_f:
self.preprocessor = dill.load(prep_f)
with open("model.dill", "rb") as model_f:
self.clf = dill.load(model_f)
def predict(self, X, feature_names=[]):
X_prep = self.preprocessor.transform(X)
proba = self.clf.predict_proba(X_prep)
return proba
Add the dependencies for the wrapper to work
%%writefile pipeline/pipeline_steps/loanclassifier/requirements.txt
scikit-learn==0.20.1
dill==0.2.9
scikit-image==0.15.0
scipy==1.1.0
numpy==1.15.4
!mkdir pipeline/pipeline_steps/loanclassifier/.s2i
%%writefile pipeline/pipeline_steps/loanclassifier/.s2i/environment
MODEL_NAME=Model
API_TYPE=REST
SERVICE_TYPE=MODEL
PERSISTENCE=0
Use the source-to-image (s2i) command to containerise the code
!s2i build pipeline/pipeline_steps/loanclassifier seldonio/seldon-core-s2i-python3:0.8 loanclassifier:0.1
Define the graph of your pipeline with individual models
%%writefile pipeline/pipeline_steps/loanclassifier/loanclassifiermodel.yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
labels:
app: seldon
name: loanclassifier
spec:
name: loanclassifier
predictors:
- componentSpecs:
- spec:
containers:
- image: loanclassifier:0.1
name: model
graph:
children: []
name: model
type: MODEL
endpoint:
type: REST
name: loanclassifier
replicas: 1
Deploy your model!
!kubectl apply -f pipeline/pipeline_steps/loanclassifier/loanclassifiermodel.yaml
Now we can send data through the REST API
X_test_alibi[:1]
array([[52, 4, 0, 2, 8, 4, 2, 0, 0, 0, 60, 9]])
%%bash
curl -X POST -H 'Content-Type: application/json' \
-d "{'data': {'names': ['text'], 'ndarray': [[52, 4, 0, 2, 8, 4, 2, 0, 0, 0, 60, 9]]}}" \
http://localhost:80/seldon/default/loanclassifier/api/v0.1/predictions
{
"meta": {
"puid": "96cmdkc4k1c6oassvpnpasqbgf",
"tags": {
},
"routing": {
},
"requestPath": {
"model": "loanclassifier:0.1"
},
"metrics": []
},
"data": {
"names": ["t:0", "t:1"],
"ndarray": [[0.86, 0.14]]
}
}
We can also reach it with the Python Client
from seldon_core.seldon_client import SeldonClient
batch = X_test_alibi[:1]
sc = SeldonClient(
gateway="ambassador",
gateway_endpoint="localhost:80",
deployment_name="loanclassifier",
payload_type="ndarray",
namespace="default",
transport="rest")
client_prediction = sc.predict(data=batch)
print(client_prediction.response)
meta {
puid: "hv4dnmr8m3ckgrhtnc48rs7mjg"
requestPath {
key: "model"
value: "loanclassifier:0.1"
}
}
data {
names: "t:0"
names: "t:1"
ndarray {
values {
list_value {
values {
number_value: 0.86
}
values {
number_value: 0.14
}
}
}
}
}
Now we can create an explainer for our model
from alibi.explainers import AnchorTabular
predict_fn = lambda x: clf.predict(preprocessor.transform(x))
explainer = AnchorTabular(predict_fn, alibi_feature_names, categorical_names=alibi_category_map)
explainer.fit(X_train_alibi, disc_perc=[25, 50, 75])
explanation = explainer.explain(X_test_alibi[0], threshold=0.95)
print('Anchor: %s' % (' AND '.join(explanation['names'])))
print('Precision: %.2f' % explanation['precision'])
print('Coverage: %.2f' % explanation['coverage'])
Anchor: Marital Status = Separated AND Sex = Female AND Capital Gain <= 0.00
Precision: 0.97
Coverage: 0.10
def predict_remote_fn(X):
from seldon_core.seldon_client import SeldonClient
from seldon_core.utils import get_data_from_proto
kwargs = {
"gateway": "ambassador",
"deployment_name": "loanclassifier",
"payload_type": "ndarray",
"namespace": "default",
"transport": "rest"
}
try:
kwargs["gateway_endpoint"] = "localhost:80"
sc = SeldonClient(**kwargs)
prediction = sc.predict(data=X)
except:
# If we are inside the container, we need to reach the ambassador service directly
kwargs["gateway_endpoint"] = "ambassador:80"
sc = SeldonClient(**kwargs)
prediction = sc.predict(data=X)
y = get_data_from_proto(prediction.response)
return y
But now we can use the remote model we have in production
# Summary of the predict_remote_fn
def predict_remote_fn(X):
....
sc = SeldonClient(...)
prediction = sc.predict(data=X)
y = get_data_from_proto(prediction.response)
return y
And train our explainer to use the remote function
from seldon_core.utils import get_data_from_proto
explainer = AnchorTabular(predict_remote_fn, alibi_feature_names, categorical_names=alibi_category_map)
explainer.fit(X_train_alibi, disc_perc=[25, 50, 75])
explanation = explainer.explain(X_test_alibi[idx], threshold=0.95)
print('Anchor: %s' % (' AND '.join(explanation['names'])))
print('Precision: %.2f' % explanation['precision'])
print('Coverage: %.2f' % explanation['coverage'])
Anchor: Marital Status = Separated AND Sex = Female
Precision: 0.97
Coverage: 0.11
To containerise our explainer, save the trained binary
import dill
with open("pipeline/pipeline_steps/loanclassifier-explainer/explainer.dill", "wb") as x_f:
dill.dump(explainer, x_f)
Expose it through a wrapper
%%writefile pipeline/pipeline_steps/loanclassifier-explainer/Explainer.py
import dill
import json
import numpy as np
class Explainer:
def __init__(self, *args, **kwargs):
with open("explainer.dill", "rb") as x_f:
self.explainer = dill.load(x_f)
def predict(self, X, feature_names=[]):
print("Received: " + str(X))
explanation = self.explainer.explain(X)
print("Predicted: " + str(explanation))
return json.dumps(explanation, cls=NumpyEncoder)
class NumpyEncoder(json.JSONEncoder):
def default(self, obj):
if isinstance(obj, (
np.int_, np.intc, np.intp, np.int8, np.int16, np.int32, np.int64, np.uint8, np.uint16, np.uint32, np.uint64)):
return int(obj)
elif isinstance(obj, (np.float_, np.float16, np.float32, np.float64)):
return float(obj)
elif isinstance(obj, (np.ndarray,)):
return obj.tolist()
return json.JSONEncoder.default(self, obj)
Add the config files needed to build the image with the s2i script
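The explainer image needs the same kind of s2i configuration as the loan classifier above. The exact files are not shown in the notebook, but they could look like the following sketch, which mirrors the loanclassifier config; the alibi version pin is an assumption:

```python
%%writefile pipeline/pipeline_steps/loanclassifier-explainer/requirements.txt
scikit-learn==0.20.1
dill==0.2.9
alibi==0.2.2
numpy==1.15.4
```

```python
%%writefile pipeline/pipeline_steps/loanclassifier-explainer/.s2i/environment
MODEL_NAME=Explainer
API_TYPE=REST
SERVICE_TYPE=MODEL
PERSISTENCE=0
```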
!s2i build pipeline/pipeline_steps/loanclassifier-explainer seldonio/seldon-core-s2i-python3:0.8 loanclassifier-explainer:0.1
!mkdir -p pipeline/pipeline_steps/loanclassifier-explainer
%%writefile pipeline/pipeline_steps/loanclassifier-explainer/loanclassifiermodel-explainer.yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
labels:
app: seldon
name: loanclassifier-explainer
spec:
name: loanclassifier-explainer
predictors:
- componentSpecs:
- spec:
containers:
- image: loanclassifier-explainer:0.1
name: model-explainer
graph:
children: []
name: model-explainer
type: MODEL
endpoint:
type: REST
name: loanclassifier-explainer
replicas: 1
Deploy your remote explainer
!kubectl apply -f pipeline/pipeline_steps/loanclassifier-explainer/loanclassifiermodel-explainer.yaml
Now we can request explanations through the REST API
%%bash
curl -X POST -H 'Content-Type: application/json' \
-d "{'data': {'names': ['text'], 'ndarray': [[52, 4, 0, 2, 8, 4, 2, 0, 0, 0, 60, 9]] }}" \
http://localhost:80/seldon/default/loanclassifier-explainer/api/v0.1/predictions
{
"meta": {
"puid": "ohbll5bcpu9gg7jjj1unll4155",
"tags": {
},
"routing": {
},
"requestPath": {
"model-explainer": "loanclassifier-explainer:0.1"
},
"metrics": []
},
"strData": "{\"names\": [\"Marital Status = Separated\", \"Sex = Female\"], \"precision\": 0.9629629629629629, \"coverage\": 0.1078, \"raw\": {\"feature\": [3, 7], \"mean\": [0.9002808988764045, 0.9629629629629629], \"precision\": [0.9002808988764045, 0.9629629629629629], \"coverage\": [0.1821, 0.1078], \"examples\": [{\"covered\": [[46, 4, 4, 2, 2, 1, 4, 1, 0, 0, 45, 9], [24, 4, 1, 2, 6, 3, 2, 1, 0, 0, 40, 9], [39, 4, 4, 2, 4, 1, 4, 1, 4650, 0, 44, 9], [40, 4, 0, 2, 5, 4, 4, 0, 0, 0, 32, 9], [39, 4, 1, 2, 8, 0, 4, 1, 3103, 0, 50, 9], [45, 4, 1, 2, 6, 5, 4, 0, 0, 0, 42, 9], [41, 4, 1, 2, 5, 1, 4, 1, 0, 0, 40, 9], [40, 4, 4, 2, 2, 0, 4, 1, 0, 0, 40, 9], [58, 4, 3, 2, 2, 2, 4, 0, 0, 0, 45, 5], [23, 4, 1, 2, 5, 1, 4, 1, 0, 0, 50, 9]], \"covered_true\": [[33, 4, 4, 2, 2, 0, 4, 1, 0, 0, 40, 9], [70, 0, 4, 2, 0, 0, 4, 1, 0, 0, 10, 9], [66, 0, 4, 2, 0, 0, 4, 1, 0, 0, 30, 9], [37, 1, 1, 2, 8, 2, 4, 0, 0, 0, 50, 9], [32, 4, 5, 2, 6, 5, 4, 0, 0, 0, 45, 9], [24, 4, 4, 2, 7, 1, 4, 1, 0, 0, 40, 9], [46, 7, 6, 2, 5, 1, 4, 0, 0, 1564, 55, 9], [28, 4, 4, 2, 2, 3, 4, 0, 0, 0, 40, 9], [28, 4, 4, 2, 2, 0, 4, 1, 3411, 0, 40, 9], [45, 4, 0, 2, 2, 0, 4, 1, 0, 0, 40, 9]], \"covered_false\": [[51, 4, 6, 2, 5, 1, 4, 0, 0, 2559, 50, 9], [35, 4, 1, 2, 5, 0, 4, 1, 0, 0, 48, 9], [48, 4, 5, 2, 5, 0, 4, 1, 0, 0, 40, 9], [41, 4, 5, 2, 8, 0, 4, 1, 0, 1977, 65, 9], [51, 6, 5, 2, 8, 4, 4, 1, 25236, 0, 50, 9], [46, 4, 4, 2, 2, 0, 4, 1, 0, 0, 75, 9], [52, 6, 1, 2, 1, 5, 4, 0, 99999, 0, 30, 9], [55, 2, 5, 2, 8, 0, 4, 1, 0, 0, 55, 9], [46, 4, 3, 2, 5, 4, 0, 1, 0, 0, 40, 9], [39, 4, 6, 2, 8, 5, 4, 0, 15024, 0, 47, 9]], \"uncovered_true\": [], \"uncovered_false\": []}, {\"covered\": [[52, 4, 4, 2, 1, 4, 4, 0, 0, 1741, 38, 9], [38, 4, 4, 2, 1, 3, 4, 0, 0, 0, 40, 9], [53, 4, 5, 2, 5, 4, 4, 0, 0, 1876, 38, 9], [54, 4, 4, 2, 8, 1, 4, 0, 0, 0, 43, 9], [43, 2, 1, 2, 5, 4, 4, 0, 0, 625, 40, 9], [27, 1, 4, 2, 8, 4, 2, 0, 0, 0, 40, 9], [47, 4, 4, 2, 1, 1, 4, 0, 0, 0, 35, 9], [54, 4, 4, 2, 8, 4, 4, 0, 0, 0, 40, 3], [43, 4, 4, 2, 8, 1, 4, 0, 0, 0, 50, 9], [53, 4, 4, 2, 5, 1, 4, 0, 0, 0, 40, 9]], \"covered_true\": [[54, 4, 4, 2, 8, 4, 4, 0, 0, 0, 40, 3], [41, 4, 4, 2, 1, 4, 4, 0, 0, 0, 40, 9], [58, 4, 4, 2, 1, 1, 4, 0, 0, 0, 40, 9], [36, 4, 4, 2, 6, 1, 4, 0, 3325, 0, 45, 9], [29, 4, 0, 2, 1, 1, 4, 0, 0, 0, 40, 9], [35, 4, 4, 2, 8, 4, 4, 0, 0, 0, 40, 9], [39, 4, 4, 2, 7, 1, 4, 0, 0, 0, 40, 8], [42, 4, 4, 2, 1, 4, 2, 0, 0, 0, 41, 9], [37, 7, 4, 2, 7, 3, 4, 0, 0, 0, 40, 9], [47, 4, 4, 2, 1, 1, 4, 0, 0, 0, 38, 9]], \"covered_false\": [[55, 5, 4, 2, 6, 4, 4, 0, 0, 0, 50, 9], [33, 7, 2, 2, 5, 5, 4, 0, 0, 0, 48, 9], [39, 4, 6, 2, 8, 5, 4, 0, 15024, 0, 47, 9], [48, 4, 5, 2, 8, 4, 4, 0, 0, 0, 40, 9], [41, 4, 1, 2, 5, 1, 4, 0, 0, 0, 50, 9], [42, 1, 5, 2, 8, 1, 4, 0, 14084, 0, 60, 9], [51, 4, 6, 2, 5, 1, 4, 0, 0, 2559, 50, 9], [52, 6, 1, 2, 1, 5, 4, 0, 99999, 0, 30, 9], [39, 7, 2, 2, 5, 1, 4, 0, 0, 0, 40, 9]], \"uncovered_true\": [], \"uncovered_false\": []}], \"all_precision\": 0, \"num_preds\": 1000101, \"names\": [\"Marital Status = Separated\", \"Sex = Female\"], \"instance\": [[52.0, 4.0, 0.0, 2.0, 8.0, 4.0, 2.0, 0.0, 0.0, 0.0, 60.0, 9.0]], \"prediction\": 0}}"
}
Now we have an explainer deployed!
Visualise metrics and explanations
Leveraging Humans for Explanations
Revisiting our workflow
Explainability and Bias Evaluation
Alejandro Saucedo
Chief Scientist, The Institute for Ethical AI & Machine Learning
Director of ML Engineering, Seldon Technologies
github.com/ethicalml/explainability-and-bias