
Bias Auditing & Fair ML Toolkit

The Bias and Fairness Audit Toolkit

Aequitas is an open-source bias audit toolkit for data scientists, machine learning researchers, and policymakers to audit machine learning models for discrimination and bias, and to make informed and equitable decisions around developing and deploying predictive tools.

Visit the Aequitas project website

Try out the Aequitas web application

Try out our interactive Colab notebook using the COMPAS dataset.

Documentation

You can find the toolkit documentation here.

For usage examples of the Python library, see our demo notebook from the KDD 2020 hands-on tutorial. Alternatively, have a look at the COMPAS notebook, which uses Aequitas on the ProPublica COMPAS Recidivism Risk Assessment dataset.

Installation

Aequitas is compatible with: Python 3.6+

Install Aequitas using pip:

pip install aequitas

If pip fails, try installing master from source:

git clone https://github.com/dssg/aequitas.git
cd aequitas
python setup.py install

(Note: be mindful of the Python version you use to run setup.py.)

You may then import the aequitas module from Python:

import aequitas

...or execute the auditor from the command line:

aequitas-report

...or launch the Web front-end from the command line (localhost):

python -m serve

Containerization

To build a Docker container of Aequitas:

docker build -t aequitas .

...or simply via manage:

manage container build

The Docker image's container defaults to launching the development Web server, though this can be overridden via the Docker "command" and/or "entrypoint".

To run such a container, supporting the Web server, on-the-fly:

docker run -p 5000:5000 -e "HOST=0.0.0.0" aequitas

...or, manage a development container via manage:

manage container [create|start|stop]

To contact the team, please email us at [aequitas at uchicago dot edu]

Aequitas Group Metrics

Below are descriptions of the absolute group metrics calculated by Aequitas. In the definitions, "score" is the model's binary decision for an entity and "label" is its ground-truth value.

  • Predicted Positive (PP): the number of entities within a group for which the decision is positive, i.e., score = 1.
  • Total Predicted Positive (K): the total number of entities predicted positive across all groups defined by the attribute.
  • Predicted Negative (PN): the number of entities within a group for which the decision is negative, i.e., score = 0.
  • Predicted Prevalence (PPrev): the fraction of entities within a group that were predicted as positive.
  • Predicted Positive Rate (PPR): the fraction of all entities predicted as positive that belong to a given group.
  • False Positive (FP): the number of entities in the group with score = 1 and label = 0.
  • False Negative (FN): the number of entities in the group with score = 0 and label = 1.
  • True Positive (TP): the number of entities in the group with score = 1 and label = 1.
  • True Negative (TN): the number of entities in the group with score = 0 and label = 0.
  • False Discovery Rate (FDR): the fraction of false positives within the predicted positives of the group (FP / PP).
  • False Omission Rate (FOR): the fraction of false negatives within the predicted negatives of the group (FN / PN).
  • False Positive Rate (FPR): the fraction of false positives within the labeled negatives of the group (FP / (FP + TN)).
  • False Negative Rate (FNR): the fraction of false negatives within the labeled positives of the group (FN / (FN + TP)).

Each bias disparity for a given group is calculated as the ratio of the group's metric value to the reference group's metric value:

    disparity_metric(group) = metric(group) / metric(reference group)
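
As a purely illustrative sketch (the metric values below are hypothetical, not taken from any dataset), the ratio can be computed directly:

    # Hypothetical metric values, for illustration only.
    fpr_group = 0.30       # False Positive Rate of the audited group
    fpr_reference = 0.15   # False Positive Rate of the reference group

    fpr_disparity = fpr_group / fpr_reference
    print(fpr_disparity)   # 2.0 -> the group's FPR is twice the reference group's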

30 Seconds to Aequitas

Python API

Detailed instructions are here.

To get started, preprocess your input data. Input data has slightly different requirements depending on whether you are using Aequitas via the webapp, CLI, or Python package. See general input requirements and specific requirements for the web app, CLI, and Python API in the Input Data section below.

If you plan to bin or discretize continuous features manually, note that get_crosstabs() expects attribute columns to be of type 'string,' so don't forget to recast any 'categorical' type columns!

    from aequitas.preprocessing import preprocess_input_df
    
    # double-check that categorical columns are of type 'string'
    df['categorical_column_name'] = df['categorical_column_name'].astype(str)
    
    df, _ = preprocess_input_df(input_data)

The Aequitas Group() class creates a crosstab of your preprocessed data, calculating absolute group metrics from the score and label values (true/false positives and true/false negatives).

    from aequitas.group import Group
    
    g = Group()
    xtab, _ = g.get_crosstabs(df)
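
If you want to inspect the raw numbers before plotting, the demo notebooks use a small helper on the Group() object to list the computed metric columns; a sketch (the helper name follows the demo notebooks and may differ across versions):

    # List the absolute metric columns computed by get_crosstabs() and
    # print them per attribute/group value (rounded for readability).
    absolute_metrics = g.list_absolute_metrics(xtab)
    print(xtab[['attribute_name', 'attribute_value'] + absolute_metrics].round(2))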

The Plot() class can visualize a single group metric with plot_group_metric(), or a list of bias metrics with plot_group_metric_all(). Suppose you are interested in False Positive Rate across groups. We can visualize this metric in Aequitas:

    from aequitas.plotting import Plot
    
    aqp = Plot()
    fpr_plot = aqp.plot_group_metric(xtab, 'fpr')

There are some very small groups in this data set, for example 18 and 32 samples in the Native American and Asian population groups, respectively.

Aequitas includes an option to filter out groups under a minimum group size threshold, as very small group size may be a contributing factor in model error rates:

    from aequitas.plotting import Plot
    
    aqp = Plot()
    fpr_plot = aqp.plot_group_metric(xtab, 'fpr', min_group_size=0.05)

Each succeeding class augments the crosstab dataframe with additional layers of information about biases, starting with bias disparities in the Bias() class. There are three get_disparity functions, one for each of the three ways to select a reference group. The get_disparity_min_metric() and get_disparity_major_group() methods calculate a reference group automatically based on your data, while the user specifies reference groups for get_disparity_predefined_groups().

    from aequitas.bias import Bias
    
    b = Bias()
    bdf = b.get_disparity_predefined_groups(xtab, 
                        original_df=df, 
                        ref_groups_dict={'race':'Caucasian', 'sex':'Male', 'age_cat':'25 - 45'}, 
                        alpha=0.05, 
                        check_significance=False)
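
For reference, minimal sketches of the two automatic reference-group methods mentioned above (keyword arguments mirror the predefined-groups call and may vary slightly between versions):

    # Reference group chosen automatically as the group with the lowest value
    # of each metric:
    bdf_min = b.get_disparity_min_metric(xtab, original_df=df)

    # Reference group chosen automatically as the largest (majority) group of
    # each attribute:
    bdf_major = b.get_disparity_major_group(xtab, original_df=df)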

Learn more about reference group selection.

The Plot() class visualizes disparities as treemaps, colored according to the disparity between a given group and the reference group. Use plot_disparity() for a single disparity or plot_disparity_all() for several. Saturation is determined by a given fairness threshold.

Let's look at False Positive Rate Disparity.

    fpr_disparity = aqp.plot_disparity(bdf, group_metric='fpr_disparity', 
                                       attribute_name='race')

Now you're ready to obtain metric parities with the Fairness() class:

    from aequitas.fairness import Fairness
    
    f = Fairness()
    fdf = f.get_group_value_fairness(bdf)

You now have parity determinations for your models that can be leveraged in model selection! If a specific bias metric for a group falls within a given percentage (based on the fairness threshold) of the reference group, the fairness determination is 'True.'

To determine whether group False Positive Rates fall within the "fair" range, use the Plot() class plot_fairness_group() method:

    fpr_fairness = aqp.plot_fairness_group(fdf, group_metric='fpr', title=True)

To quickly review False Positive Rate Disparity fairness determinations, we can use the Plot() class plot_fairness_disparity() method:

    fpr_disparity_fairness = aqp.plot_fairness_disparity(fdf, group_metric='fpr', attribute_name='race')
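
For a single high-level summary rather than per-group determinations, the demo notebooks also roll parity up by attribute and overall; a sketch using the Fairness() method names from those notebooks (they may differ in your version):

    # Roll group-level parity up to the attribute level, then to an overall
    # fairness determination (method names as used in the demo notebooks).
    gaf = f.get_group_attribute_fairness(fdf)
    gof = f.get_overall_fairness(gaf)
    print(gof)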

Input Data

In general, input data is a single table with the following columns:

  • score
  • label_value (for error-based metrics only)
  • at least one attribute, e.g. race, sex, and age_cat (attribute categories are defined by the user)
score  label_value  race              sex     age  income
0      1            African-American  Female  27   18000
1      1            Caucasian         Male    32
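
A minimal pandas sketch of building such a table (values mirror the example rows above; the missing income in the second row is left missing here as well):

    import pandas as pd

    # Example input table with a score, a label, and several attributes.
    df = pd.DataFrame({
        'score':       [0, 1],
        'label_value': [1, 1],
        'race':        ['African-American', 'Caucasian'],
        'sex':         ['Female', 'Male'],
        'age':         [27, 32],
        'income':      [18000, None],   # second value not given in the example
    })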

Back to 30 Seconds to Aequitas

Input data for Webapp

The webapp requires a single CSV with columns for a binary score, a binary label_value and an arbitrary number of attribute columns. Each row is associated with a single observation.

score

The Aequitas webapp assumes the score column is a binary decision (0 or 1).

label_value

This is the ground truth value of a binary decision. The data again must be binary 0 or 1.

attributes (e.g. race, sex, age, income)

Group columns can be categorical or continuous. If categorical, Aequitas will produce crosstabs with bias metrics for each group_level. If continuous, Aequitas will first bin the data into quartiles and then create crosstabs with the newly defined categories.

Back to 30 Seconds to Aequitas

Input data for CLI

The CLI accepts CSV files and accommodates database calls defined in Configuration files.

score

By default, Aequitas CLI assumes the score column is a binary decision (0 or 1). Alternatively, the score column can contain the score (e.g. the output from a logistic regression applied to the data). In this case, the user sets a threshold to determine the binary decision. See configurations for more on thresholds.

label_value

As with the webapp, this is the ground truth value of a binary decision. The data must be binary 0 or 1.

attributes (e.g. race, sex, age, income)

Group columns can be categorical or continuous. If categorical, Aequitas will produce crosstabs with bias metrics for each group value. If continuous, Aequitas will first bin the data into quartiles.

model_id

model_id is an identifier tied to the output of a specific model. With a model_id column you can test the bias of multiple models at once. This feature is available using the CLI or the Python package.

Reserved column names:
  • id
  • model_id
  • entity_id
  • rank_abs
  • rank_pct

Back to 30 Seconds to Aequitas

Input data for Python API

Python input data can be handled identically to CLI by using preprocess_input_df(). Otherwise, you must discretize continuous attribute columns prior to passing the data to Group().get_crosstabs().

    from aequitas.preprocessing import preprocess_input_df
    # 'input_data' matches the CLI input data norms.
    df, _ = preprocess_input_df(input_data)

score

By default, Aequitas assumes the score column is a binary decision (0 or 1). If the score column contains a non-binary score (e.g. the output from a logistic regression applied to the data), the user sets a threshold to determine the binary decision. Thresholds are set in a dictionary passed to get_crosstabs(), in the format {'rank_abs': [300], 'rank_pct': [1.0, 5.0, 10.0]}. See configurations for more on thresholds.
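
A sketch of passing that dictionary (the keyword is score_thresholds in the versions of get_crosstabs() we have seen; treat the exact name as an assumption and check your installed documentation):

    from aequitas.group import Group

    g = Group()
    # Audit at an absolute top-k of 300 and at the top 1%, 5%, and 10% of scores.
    xtab, _ = g.get_crosstabs(
        df,
        score_thresholds={'rank_abs': [300], 'rank_pct': [1.0, 5.0, 10.0]},
    )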

label_value

This is the ground truth value of a binary decision. The data must be binary (0 or 1).

attributes (e.g. race, sex, age, income)

Group columns can be categorical or continuous. If categorical, Aequitas will produce crosstabs with bias metrics for each group_level. If continuous, Aequitas will first bin the data into quartiles.

If you plan to bin or discretize continuous features manually, note that get_crosstabs() expects attribute columns to be of type 'string'. This excludes the pandas 'categorical' data type, which is the default output of certain pandas discretizing functions. You can recast 'categorical' columns to strings:

   df['categorical_column_name'] = df['categorical_column_name'].astype(str)
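
For example, pandas binning helpers such as pd.qcut return a 'category' dtype, so a manually binned column needs the same recast (a sketch; 'age_binned' is just an illustrative column name):

    import pandas as pd

    # pd.qcut returns a pandas 'category' dtype by default; recast to string
    # before passing the DataFrame to get_crosstabs().
    df['age_binned'] = pd.qcut(df['age'], q=4)        # quartile bins (categorical)
    df['age_binned'] = df['age_binned'].astype(str)   # plain strings, as expected
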
model_id

model_id is an identifier tied to the output of a specific model. With a model_id column you can test the bias of multiple models at once. This feature is available using the CLI or the Python package.
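
A sketch of auditing two models at once by stacking their scored outputs and tagging each with a model_id (df_model_1 and df_model_2 are hypothetical DataFrames already in the input format described above):

    import pandas as pd
    from aequitas.group import Group

    # df_model_1 / df_model_2: scored outputs of two models (placeholders).
    df_model_1['model_id'] = 1
    df_model_2['model_id'] = 2
    df_all = pd.concat([df_model_1, df_model_2], ignore_index=True)

    # Metrics are then reported per model_id as well as per group.
    g = Group()
    xtab, _ = g.get_crosstabs(df_all)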

Reserved column names:
  • id
  • model_id
  • entity_id
  • rank_abs
  • rank_pct

Back to 30 Seconds to Aequitas

Development

Provision your development environment via the shell script develop:

./develop

Common development tasks, such as deploying the webapp, may then be handled via manage:

manage --help

Citing Aequitas

If you use Aequitas in a scientific publication, we would appreciate citations to the following paper:

Pedro Saleiro, Benedict Kuester, Abby Stevens, Ari Anisfeld, Loren Hinkson, Jesse London, Rayid Ghani, Aequitas: A Bias and Fairness Audit Toolkit, arXiv preprint arXiv:1811.05577 (2018). (PDF)

   @article{2018aequitas,
     title={Aequitas: A Bias and Fairness Audit Toolkit},
     author={Saleiro, Pedro and Kuester, Benedict and Stevens, Abby and Anisfeld, Ari and Hinkson, Loren and London, Jesse and Ghani, Rayid},
     journal={arXiv preprint arXiv:1811.05577},
     year={2018}
   }
