awesome-causality-data
An index of datasets that can be used for learning causality.
Please cite our survey if this data index helps your research.
@article{guo2018survey,
title={A Survey of Learning Causality with Data: Problems and Methods},
author={Guo, Ruocheng and Cheng, Lu and Li, Jundong and Hahn, P. Richard and Liu, Huan},
journal={arXiv preprint arXiv:1809.09337},
year={2018}
}
Updates coming soon
Datasets for Learning Causal Effects (Causal Inference)
Causal Effect Estimation with Single Cause
Datasets with i.i.d. samples
Standard datasets for learning causal effects come with each instance in the format (x,d,y).
How is IHDP1 (setting A) simulated
Job Training (Lalonde 1986 in the R package qte)
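As a rough illustration of the (x,d,y) format mentioned above, the sketch below stores each instance as a small record and computes a naive difference in mean outcomes. It assumes x is the feature vector, d a binary treatment indicator, and y the observed outcome. The `Instance` class, the `difference_in_means` helper, and the toy values are hypothetical and are not taken from any dataset listed here.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Instance:
    """One record in the assumed (x, d, y) format."""
    x: Sequence[float]  # features (covariates)
    d: int              # treatment indicator, assumed binary: 1 = treated, 0 = control
    y: float            # observed outcome


def difference_in_means(data: List[Instance]) -> float:
    """Naive effect estimate: mean outcome of treated minus mean outcome of controls.

    This ignores x entirely, so it is only a baseline; with confounding it is
    generally biased.
    """
    treated = [inst.y for inst in data if inst.d == 1]
    control = [inst.y for inst in data if inst.d == 0]
    if not treated or not control:
        raise ValueError("need at least one treated and one control instance")
    return sum(treated) / len(treated) - sum(control) / len(control)


# Made-up toy data, for illustration only.
toy = [
    Instance(x=[0.2, 1.0], d=1, y=3.0),
    Instance(x=[0.5, 0.0], d=1, y=5.0),
    Instance(x=[0.1, 1.0], d=0, y=1.0),
    Instance(x=[0.9, 0.0], d=0, y=3.0),
]
naive_effect = difference_in_means(toy)
```

Methods that adjust for x (matching, propensity scores, outcome regression) would consume the same records; the naive estimate is shown only because it makes the role of d and y explicit.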
Datasets with non-i.i.d. samples (with interference, spillover effect or auxiliary network information)
Datasets with Instrumental Variables (IV)
In standard datasets for learning causal effects with instrumental variables, each instance has the format (i,x,d,y).
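To make the (i,x,d,y) format concrete, here is a minimal sketch that assumes i is a binary instrument, x the features, d a binary treatment, and y the outcome, and applies the simple Wald ratio estimator. The `IVInstance` type, the `wald_estimate` helper, and the toy records are hypothetical; the estimator is one common choice for this setting, not a method prescribed by this index.

```python
from typing import List, NamedTuple, Sequence


class IVInstance(NamedTuple):
    """One record in the assumed (i, x, d, y) format."""
    i: int              # instrument, assumed binary here
    x: Sequence[float]  # features (unused by the simple Wald estimator)
    d: int              # treatment indicator
    y: float            # observed outcome


def _mean(values: List[float]) -> float:
    return sum(values) / len(values)


def wald_estimate(data: List[IVInstance]) -> float:
    """Wald ratio: (E[y|i=1] - E[y|i=0]) / (E[d|i=1] - E[d|i=0])."""
    on = [r for r in data if r.i == 1]
    off = [r for r in data if r.i == 0]
    if not on or not off:
        raise ValueError("need records with both instrument values")
    outcome_shift = _mean([r.y for r in on]) - _mean([r.y for r in off])
    treatment_shift = _mean([float(r.d) for r in on]) - _mean([float(r.d) for r in off])
    if treatment_shift == 0:
        raise ValueError("instrument does not change treatment uptake")
    return outcome_shift / treatment_shift


# Made-up toy data, for illustration only.
toy_iv = [
    IVInstance(i=1, x=[0.3], d=1, y=4.0),
    IVInstance(i=1, x=[0.7], d=1, y=6.0),
    IVInstance(i=1, x=[0.1], d=0, y=2.0),
    IVInstance(i=1, x=[0.4], d=1, y=4.0),
    IVInstance(i=0, x=[0.6], d=0, y=2.0),
    IVInstance(i=0, x=[0.2], d=0, y=2.0),
    IVInstance(i=0, x=[0.8], d=1, y=4.0),
    IVInstance(i=0, x=[0.5], d=0, y=2.0),
]
iv_effect = wald_estimate(toy_iv)
```

The ratio is only meaningful under the usual IV assumptions (the instrument affects y solely through d and is independent of unobserved confounders), which the sketch does not check.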
Datasets for Regression Discontinuity Design
Population Threshold RDD Datasets
Datasets with Multiple Causes
Datasets for Learning Causal Relationships (Causal Discovery)
Distinguishing Cause from Effect
Database with cause-effect pairs (Tübingen Cause-Effect Pairs)
Causal Bayesian Network
Lung Cancer Simple Set (LUCAS)