• Stars
    191
  • Rank 202,877 (Top 4 %)
  • Language
    Jupyter Notebook
  • License
    Apache License 2.0
  • Created almost 3 years ago
  • Updated 3 months ago

Repository Details

AutoML for causal inference.

CausalTune: A library for automated Causal Inference model estimation and selection

CausalTune is a library for automated tuning and selection for causal estimators.

Its estimators are taken from EconML augmented by a couple of extra models (currently Transformed Outcome and a dummy model to be used as a baseline), all called in a uniform fashion via a DoWhy wrapper.

Our contribution is enabling automatic estimator tuning and selection by out-of-sample scoring of causal estimators, notably using the energy score. We use FLAML for hyperparameter optimisation.
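For intuition about the energy score, here is a minimal sketch of the generic energy distance between two samples, written with NumPy only. The function name `energy_distance` and the toy data are ours for illustration. This is not CausalTune's scoring code, and how CausalTune constructs the samples it compares is not shown here.

```python
import numpy as np


def _mean_pairwise_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Mean Euclidean distance over all pairs (row of a, row of b)."""
    diffs = a[:, None, :] - b[None, :, :]
    return float(np.sqrt((diffs ** 2).sum(axis=-1)).mean())


def energy_distance(x, y) -> float:
    """Energy distance 2*E|X-Y| - E|X-X'| - E|Y-Y'| between two samples.

    Rows are observations; 1-D inputs are treated as single-feature samples.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    if y.ndim == 1:
        y = y[:, None]
    return (
        2.0 * _mean_pairwise_distance(x, y)
        - _mean_pairwise_distance(x, x)
        - _mean_pairwise_distance(y, y)
    )


# Toy check: two samples from the same distribution versus a shifted one.
rng = np.random.default_rng(0)
same_a = rng.normal(size=(200, 2))
same_b = rng.normal(size=(200, 2))
shifted = rng.normal(loc=1.0, size=(200, 2))

print("same distribution:", energy_distance(same_a, same_b))
print("shifted distribution:", energy_distance(same_a, shifted))
```

The distance is zero when the two samples are identical and grows as the distributions move apart, which is what makes it usable as a score.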

We perform automated hyperparameter tuning of first stage models (for the treatment and outcome models) as well as hyperparameter tuning and model selection for the second stage model (causal estimator).

The estimators provide not only per-row treatment impact estimates but also confidence intervals for them, using built-in EconML functionality where it is available and bootstrapping where it is not (see example notebook).
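As a rough illustration of the bootstrapping idea, the sketch below computes a percentile bootstrap interval for a simple difference in means. It deliberately uses a much simpler quantity than the per-row estimates described above, and the function name `bootstrap_ate_interval` and the simulated data are assumptions made for this example, not part of CausalTune.

```python
import numpy as np


def bootstrap_ate_interval(outcome, treatment, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(treated) - mean(control)."""
    outcome = np.asarray(outcome, dtype=float)
    treated = np.asarray(treatment).astype(bool)
    rng = np.random.default_rng(seed)
    n = len(outcome)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        # Resample rows with replacement and recompute the estimate.
        idx = rng.integers(0, n, size=n)
        y, t = outcome[idx], treated[idx]
        estimates[b] = y[t].mean() - y[~t].mean()
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)


# Simulated randomized test with a true effect of 1.0.
rng = np.random.default_rng(1)
treatment = rng.integers(0, 2, size=2000)
outcome = 1.0 * treatment + rng.normal(size=2000)

lower, upper = bootstrap_ate_interval(outcome, treatment)
print(f"95% bootstrap interval: [{lower:.3f}, {upper:.3f}]")
```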

Just like DoWhy and EconML, we assume that the causal graph provided by the user accurately describes the data-generating process. For example, for CATE estimation we assume that the list of backdoor variables under the graph (the confounding variables provided by the user) reflects all sources of confounding between the treatment and the outcome. See here for a detailed explanation of causal graphs that are supported by CausalTune.

The validation methods in CausalTune cannot catch such violations and therefore this is an important assumption.

We also implement the ERUPT calculation (also known as policy value). After an (even partially) randomized test, it lets you estimate what the impact of other treatment assignment policies would have been. It can also be used as an alternative out-of-sample score, though the energy score performed better in our synthetic data experiments.

What can this do for you?

The automated search over the many powerful models from EconML and elsewhere allows you to easily do the following:

1. Supercharge A/B tests by getting impact by customer, instead of just an average

By enriching the results of a regular A/B/N test with customer features, and running CausalTune on the resulting dataset, you can get impact estimates as a function of customer features, allowing precise targeting by impact in the next iteration. CausalTune also serves as a variance reduction method leveraging the availability of any additional features. Example notebook

2. Continuous testing combined with exploitation: (Dynamic) uplift modelling

The per-customer impact estimates, even if noisy, can be used to implement per-customer Thompson sampling for new customers, biasing random treatment assignment towards the treatments we think are most likely to work. As we still control the per-customer propensity to treat, the same methods as above can be applied to keep refining our impact estimates.

Thus, there is no need to wait for the test to gather enough data for significance, or ever to end the test, before using its results to assign the most impactful treatment (based on our knowledge so far) to each customer.

As the propensity to treat is known for each customer in this case, you can supply it explicitly as a column to the estimators instead of having it estimated from the data as in other cases.
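The per-customer Thompson sampling described in this section can be sketched as follows. We assume, purely for illustration, that each customer has an estimated mean outcome and a standard error for every arm, and that the uncertainty is roughly normal. The function `thompson_assign` and its arguments are hypothetical and not part of the CausalTune API. The returned propensities are the per-customer assignment probabilities that you would store and later supply as a column.

```python
import numpy as np


def thompson_assign(arm_mean, arm_std, n_draws=2000, seed=0):
    """Assign one arm per customer by Thompson sampling.

    arm_mean, arm_std: arrays of shape (n_customers, n_arms) with the
    estimated outcome of each arm and its standard error (assumed inputs).
    Returns (assigned_arm, propensity), where propensity[i, k] is the
    Monte Carlo estimate of the probability that customer i gets arm k.
    """
    rng = np.random.default_rng(seed)
    mean = np.asarray(arm_mean, dtype=float)
    std = np.asarray(arm_std, dtype=float)
    n_arms = mean.shape[1]

    # Draw plausible outcomes for every arm, then see which arm wins each draw.
    draws = rng.normal(mean, std, size=(n_draws,) + mean.shape)
    winners = draws.argmax(axis=-1)  # shape (n_draws, n_customers)
    propensity = np.stack(
        [(winners == k).mean(axis=0) for k in range(n_arms)], axis=1
    )

    # Sample the actual assignment from the estimated propensities.
    assigned = np.array([rng.choice(n_arms, p=p) for p in propensity])
    return assigned, propensity


# Customer 0: arm 1 is clearly better. Customer 1: no information yet.
means = np.array([[0.0, 1.0], [0.0, 0.0]])
stds = np.array([[0.1, 0.1], [1.0, 1.0]])
assigned, propensity = thompson_assign(means, stds)
print(assigned)
print(propensity)
```

Customers with confident estimates are assigned almost deterministically, while uncertain ones stay close to a coin flip, so exploration continues where it is still needed.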

3. Estimate the benefit of smarter (but still partially random) assignment compared to fully random without the need for an actual fully random test group

The previous section described using causal estimators to bias treatment assignment towards the choice we think is most likely to work best for a given customer.

However, after the fact we would like to know the extra benefit of that compared to a fully random assignment. The ERUPT technique (see the sample notebook) re-weights the actual outcomes to produce an unbiased estimate of the average outcome that a fully random assignment would have yielded, with no additional group needed.
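A minimal sketch of this re-weighting, using NumPy only: each observed outcome is weighted by the ratio of the probability that the proposed policy would pick the arm actually received to the probability with which that arm was actually assigned. The function name `erupt`, its arguments, and the simulated data are assumptions for this example and do not reflect CausalTune's implementation. The sketch uses the self-normalised form of the estimator, which trades exact unbiasedness for lower variance.

```python
import numpy as np


def erupt(outcome, treatment, propensity, policy_probs):
    """Self-normalised inverse-propensity estimate of the mean outcome
    under a proposed assignment policy.

    outcome:      (n,) observed outcomes
    treatment:    (n,) index of the arm each row actually received
    propensity:   (n, k) probability each arm had of being assigned in the test
    policy_probs: (n, k) probability the proposed policy assigns each arm
    """
    outcome = np.asarray(outcome, dtype=float)
    treatment = np.asarray(treatment, dtype=int)
    propensity = np.asarray(propensity, dtype=float)
    policy_probs = np.asarray(policy_probs, dtype=float)
    rows = np.arange(len(outcome))
    weights = policy_probs[rows, treatment] / propensity[rows, treatment]
    return float((weights * outcome).sum() / weights.sum())


# Simulated test: arm 1 adds 1.0 to the outcome and was assigned 80% of the time.
rng = np.random.default_rng(0)
n = 50_000
propensity = np.tile([0.2, 0.8], (n, 1))
treatment = (rng.random(n) < 0.8).astype(int)
outcome = treatment + rng.normal(size=n)

# Proposed policy: fully random 50/50 assignment.
random_policy = np.full((n, 2), 0.5)

observed_mean = outcome.mean()  # expectation 0.8 in this simulation
erupt_random = erupt(outcome, treatment, propensity, random_policy)  # expectation 0.5
print(f"observed: {observed_mean:.3f}, fully random (ERUPT): {erupt_random:.3f}")
```

The difference between the observed mean and the ERUPT estimate for the fully random policy is the estimated benefit of the smarter assignment.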

4. Observational inference

This is the traditional application of causal inference: for example, estimating the impact on volumes and churn likelihood of the time it takes us to answer a customer query. As the set of customers who have support queries is most likely not randomly sampled, confounding corrections are needed.

As with the other use cases, the advanced causal inference models allow impact estimation as a function of customer features, rather than just averages, under the assumption that all relevant confounders are observed.

To use this, set propensity_model to an instance of the desired classifier when instantiating CausalTune, or to "auto" if you want to use the FLAML classifier. The default setting is "dummy", which assumes random assignment and infers the assignment probability from the data. Example notebook
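To see why a propensity model matters for observational data, consider the following NumPy-only sketch. A single binary confounder drives both treatment and outcome, and a per-stratum treatment frequency stands in for the classifier you would pass as propensity_model. The function `ipw_ate` and the simulated data are assumptions for this example and are not CausalTune code.

```python
import numpy as np


def ipw_ate(outcome, treatment, propensity):
    """Inverse-propensity-weighted ATE with self-normalised weights."""
    outcome = np.asarray(outcome, dtype=float)
    treatment = np.asarray(treatment, dtype=float)
    propensity = np.asarray(propensity, dtype=float)
    w_treated = treatment / propensity
    w_control = (1.0 - treatment) / (1.0 - propensity)
    mean_treated = (w_treated * outcome).sum() / w_treated.sum()
    mean_control = (w_control * outcome).sum() / w_control.sum()
    return float(mean_treated - mean_control)


# Simulated observational data: the confounder raises both the chance of
# treatment and the outcome. The true treatment effect is 1.0.
rng = np.random.default_rng(0)
n = 100_000
confounder = rng.integers(0, 2, size=n)
treat_prob = np.where(confounder == 1, 0.8, 0.2)
treatment = (rng.random(n) < treat_prob).astype(int)
outcome = 1.0 * treatment + 3.0 * confounder + rng.normal(size=n)

# Naive comparison ignores the confounder and is biased upwards.
naive = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()

# Stand-in propensity model: treatment frequency within each confounder stratum.
estimated_propensity = np.where(
    confounder == 1,
    treatment[confounder == 1].mean(),
    treatment[confounder == 0].mean(),
)
adjusted = ipw_ate(outcome, treatment, estimated_propensity)
print(f"naive: {naive:.3f}, propensity-adjusted: {adjusted:.3f}")
```

The adjustment only works here because the confounder is observed, which is the assumption stated above.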

If you have reason to suspect unobserved confounders, such as customer intent (did the customer do a lot of volume because of the promotion, or did they sign up for the promotion because they intended to do lots of volume anyway?), consider looking for an instrumental variable instead.

Note that our derivation of the energy score as a valid out-of-sample score for causal models is, strictly speaking, not applicable to this use case, but it still appears to work reasonably well in practice.

5. IV models: Impact of customer choosing to use a feature

Instrumental variable (IV) estimation avoids the estimation bias caused by unobserved confounders.

A natural use case for IV models is making a feature or a promotion available to a customer, and trying to measure the impact of the customer actually choosing to use the feature (the impact of making the feature available can be estimated with 1. and 2. above).

Here we use feature availability as an instrumental variable (assuming its assignment to be strictly randomized), and search over IV models in EconML to estimate the impact of the customer choosing to use it. To score IV model fits out of sample, we again use the energy score. Example notebook
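The logic of using feature availability as an instrument can be illustrated with the simplest IV estimator, the Wald ratio. This is only a sketch under a constant-effect simulation of our own. CausalTune searches over the IV models in EconML rather than using this estimator, and the function `wald_iv_estimate` is hypothetical.

```python
import numpy as np


def wald_iv_estimate(outcome, treatment, instrument):
    """Wald estimator: effect of the instrument on the outcome divided by
    its effect on treatment uptake (binary instrument)."""
    outcome = np.asarray(outcome, dtype=float)
    treatment = np.asarray(treatment, dtype=float)
    available = np.asarray(instrument).astype(bool)
    outcome_shift = outcome[available].mean() - outcome[~available].mean()
    uptake_shift = treatment[available].mean() - treatment[~available].mean()
    return float(outcome_shift / uptake_shift)


# Simulation: availability is randomized, but customers with high unobserved
# intent are more likely to use the feature AND have higher outcomes.
rng = np.random.default_rng(0)
n = 100_000
available = rng.integers(0, 2, size=n)  # instrument
intent = rng.normal(size=n)  # unobserved confounder
wants_feature = (intent + rng.normal(size=n)) > 0
uses_feature = (available == 1) & wants_feature  # treatment
outcome = 2.0 * uses_feature + 1.5 * intent + rng.normal(size=n)  # true effect 2.0

naive = outcome[uses_feature].mean() - outcome[~uses_feature].mean()
iv = wald_iv_estimate(outcome, uses_feature, available)
print(f"naive: {naive:.3f}, IV (Wald): {iv:.3f}")
```

The naive comparison absorbs the effect of intent, while the ratio recovers the effect of usage because availability is independent of intent.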

Please be aware we have not yet extensively used the IV model fitting functionality internally, so if you run into any issues, please report them!

Installation

To install from source, see the For Developers section below.

Requirements

CausalTune works with Python 3.8 and 3.9.

It requires the following libraries to work:

  • NumPy
  • Pandas
  • EconML
  • DoWhy
  • Scikit-Learn

If you run into any problems, try installing the dependencies manually:

pip install -r requirements.txt

Quick Start

The CausalTune package can be used like a scikit-style estimator:

from causaltune import CausalTune
from causaltune.datasets import synth_ihdp

# prepare dataset
data = synth_ihdp()
data.preprocess_dataset()


# init CausalTune object with chosen metric to optimise
ct = CausalTune(time_budget=600, metric="energy_distance")

# run CausalTune
ct.fit(data)

# return best estimator
print(f"Best estimator: {ct.best_estimator}")

Supported Models

The package supports the following causal estimators:

  • Meta Learners:
    • S-Learner
    • T-Learner
    • X-Learner
    • Domain Adaptation Learner
  • DR Learners:
    • Forest DR Learner
    • Linear DR Learner
    • Sparse Linear DR Learner
  • DML Learners:
    • Linear DML
    • Sparse Linear DML
    • Causal Forest DML
  • Ortho Forests:
    • DR Ortho Forest
    • DML Ortho Forest
  • Transformed Outcome
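For readers unfamiliar with the Transformed Outcome approach, it is commonly built on the identity sketched below: with a known propensity p, the variable Y* = Y (T - p) / (p (1 - p)) has a conditional mean equal to the treatment effect, so any regressor fitted on Y* estimates the CATE. Whether CausalTune's Transformed Outcome model follows exactly this form is an assumption on our part, and the function name and simulated data are illustrative only.

```python
import numpy as np


def transformed_outcome(outcome, treatment, propensity):
    """Return Y* = Y * (T - p) / (p * (1 - p)) for a binary treatment."""
    outcome = np.asarray(outcome, dtype=float)
    treatment = np.asarray(treatment, dtype=float)
    propensity = np.asarray(propensity, dtype=float)
    return outcome * (treatment - propensity) / (propensity * (1.0 - propensity))


# Simulated randomized test with a heterogeneous effect:
# the effect is 1.0 when segment == 0 and 3.0 when segment == 1.
rng = np.random.default_rng(0)
n = 200_000
segment = rng.integers(0, 2, size=n)
treatment = rng.integers(0, 2, size=n)
true_effect = 1.0 + 2.0 * segment
outcome = 1.0 + true_effect * treatment + rng.normal(size=n)

y_star = transformed_outcome(outcome, treatment, np.full(n, 0.5))

# Averaging Y* within a segment approximates that segment's treatment effect.
effect_segment_0 = y_star[segment == 0].mean()
effect_segment_1 = y_star[segment == 1].mean()
print(f"segment 0: {effect_segment_0:.3f}, segment 1: {effect_segment_1:.3f}")
```

Here the per-segment average plays the role of the regressor. In practice Y* is noisy, so a flexible model and plenty of data are usually needed.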

Supported Metrics

We support a variety of different metrics that quantify the performance of a causal model:

  • Energy distance
  • ERUPT (Expected Response Under Proposed Treatments)
  • Qini coefficient and AUC (area under curve)
  • ATE (average treatment effect)

Citation

If you use CausalTune in your research, please cite us as follows:

Timo Debono, Julian Teichgräber, Timo Flesch, Edward Zhang, Guy Durant, Wen Hao Kho, Mark Harley, Egor Kraev. CausalTune: A Python package for Automated Causal Inference model estimation and selection. https://github.com/py-why/causaltune. 2022. Version 0.x.

You can use the following BibTeX entry:

@misc{CausalTune,
  author={Timo Debono and Julian Teichgr{\"a}ber and Timo Flesch and Edward Zhang and Guy Durant and Wen Hao Kho and Mark Harley and Egor Kraev},
  title={{CausalTune}: {A Python package for Automated Causal Inference model estimation and selection}},
  howpublished={https://github.com/py-why/causaltune},
  note={Version 0.x},
  year={2022}
}

For Developers

Installation from source

We use Setuptools for building and distributing our package. To install the latest version from source, clone this repository and run the following command from the top-most folder of the repository:

pip install -e .

Testing

We use PyTest for testing. If you want to contribute code, make sure that the tests run without errors.

Contribution

See the Contribution file for contribution licensing and code guidelines.
