Stata commands designed for Impact Evaluations in particular, but also data work in general

ietoolkit - Stata Commands for Impact Evaluations

Install and Update

Installing published versions of ietoolkit

To install ietoolkit, type ssc install ietoolkit in Stata. This will install the latest published version of ietoolkit. The main version of the code in the repo (the master branch) is what is published on SSC as well.

If you think the version in this repo differs from the version installed on your computer, make sure that you are looking at the master branch in this repo and that you have the most recent version of ietoolkit installed. To update all files associated with ietoolkit, type adoupdate ietoolkit, update in Stata. (It is wise to get in the habit of regularly checking whether any of the .ado files installed in Stata need updates by typing adoupdate.)

When we publish a new version of ietoolkit, there can be a brief discrepancy between the master branch and the version on SSC, because the master branch is updated a couple of days before SSC. You can check whether that might be the case by seeing if we recently published a new release.
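As a sketch, the install and update steps described above can be run from the Stata command line or placed in a do-file. The final `which` call is a suggestion and not part of the original instructions; it is a built-in Stata command that reports which copy of a command Stata finds and prints the version header of that file.

```stata
* Install the latest published version of ietoolkit from SSC
ssc install ietoolkit

* Update all files associated with ietoolkit
adoupdate ietoolkit, update

* Optional check (a suggestion, not from the original instructions):
* show which ietoolkit.ado Stata finds and print its version header
which ietoolkit
```

Comparing the version header printed by `which` against the latest release in the repo is one way to confirm that the installed copy is current.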

Installing unpublished branches of this repository

Follow the instructions above if you want the most recent published version of ietoolkit. If you want a version that has not yet been published, use the code below. It installs the version currently in the master branch; to install from another branch, replace master in the URL with the name of that branch. You can also install older versions of ietoolkit this way, but only back to January 2019, when we set up this method of installing the package.

    net install ietoolkit , from("https://raw.githubusercontent.com/worldbank/ietoolkit/master/src") replace
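To make the branch substitution explicit, the following sketch stores the branch name in a local macro and builds the URL from it. The macro names `branch` and `url` are illustrative, and `develop` is used only because it is the branch named in the Contributions section; substitute whichever branch you need.

```stata
* Name of the branch to install from (illustrative choice; change as needed)
local branch "develop"

* Build the raw.githubusercontent.com URL for that branch's src folder
local url "https://raw.githubusercontent.com/worldbank/ietoolkit/`branch'/src"

* Install from that branch, replacing any version already installed
net install ietoolkit, from("`url'") replace
```

Local macros only exist while the lines that define them are running, so run these lines together in one do-file or one selection rather than one at a time.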

Requirements

Stata version 11 or later is required for this package of commands.
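A do-file that depends on ietoolkit can guard against older Stata installations. This is a minimal sketch, not something the package asks you to add: `c(stata_version)` is Stata's built-in record of the running version, while the message text and the return code 198 are arbitrary choices made for illustration.

```stata
* Stop early if the running Stata is older than version 11
if c(stata_version) < 11 {
    display as error "ietoolkit requires Stata 11 or later; this is Stata `c(stata_version)'"
    exit 198
}
```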

Background

These commands are developed by people that work at or with the Development Impact Evaluations (DIME) unit at the World Bank. While the commands are developed with best practices for impact evaluations in mind, these commands can be useful outside that field as well.

Bug Reports and Feature Requests

If you are familiar with GitHub go to the Contributions section below for advanced instructions.

An easy but still very efficient way to provide any feedback on these commands is to create an issue in GitHub. You can read issues submitted by other users or create a new issue in the top menu below worldbank/ietoolkit at https://github.com/worldbank/ietoolkit. While the word issue has a negative connotation outside GitHub, it can be used for any kind of feedback. If you have an idea for a new command, or a new feature on an existing command, creating an issue is a great tool for suggesting that. Please read already existing issues to check whether someone else has made the same suggestion or reported the same error before creating a new issue.

While we have a slight preference for receiving feedback here on GitHub, you are still very welcome to send a regular email with your feedback to [email protected].

Content

ietoolkit provides a set of commands that address different aspects of data management and data analysis in relation to Impact Evaluations. The list of commands will be extended continuously, and suggestions for new commands are greatly appreciated. Some of the commands are related to standardized best practices developed at DIME (The World Bank’s unit for Impact Evaluations). For these commands, the corresponding help files provide justifications for the standardized best practices applied.

  • ietoolkit returns meta info on the version of ietoolkit installed. Can be used to ensure that the team uses the same version.
  • iebaltab is a tool for multiple treatment arm balance tables
  • ieddtab is a tool for difference-in-difference regression tables
  • ieboilstart standardizes the boilerplate code at the top of all do-files
  • iefolder sets up project folders and master do-files according to DIME's recommended folder structure
  • iegitaddmd adds placeholder README.md files to all empty subfolders allowing them to be synced on GitHub
  • iematch is an algorithm for matching observations in one group to the "most similar" observations in another group
  • iegraph produces graphs of estimation results in common impact evaluation regression models
  • iedropone drops observations and checks that the correct number was dropped
  • ieboilsave performs checks before saving a data set
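As an illustration of the first item in the list, a master do-file could check the installed version of ietoolkit before running anything else. This sketch assumes that the `ietoolkit` command stores the version number in `r(version)`, and the minimum version `5.0` is an arbitrary placeholder; confirm both with `help ietoolkit` before relying on it.

```stata
* Try to run ietoolkit; capture suppresses the error if it is not installed
capture ietoolkit

if _rc != 0 {
    * The command could not be run, so ietoolkit is not installed: install it
    ssc install ietoolkit
}
else {
    * The command ran; compare the returned version to a placeholder minimum
    if `r(version)' < 5.0 {
        * Installed version is older than the minimum: fetch the latest one
        ssc install ietoolkit, replace
    }
}
```

Placing a check like this at the top of a shared master do-file is one way to ensure that the team uses the same version.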

Contributions

If you are not familiar with GitHub, see the Bug Reports and Feature Requests section above for a less technical but still very helpful way to contribute to ietoolkit.

GitHub is a wonderful tool for collaboration on code. We appreciate contributions directly to the code and will of course give credit to anyone providing contributions that we merge to the master branch. If you have any questions on anything in this section, please do not hesitate to email [email protected]. See CONTRIBUTING.md for more details on, for example, naming conventions.

The Stata files on the master branch are the files most recently released on the SSC server. README, LICENSE and similar files are updated directly to master in between releases. Check out any of the develop branches (if there are any) if you want to see what future updates we are currently working on.

Please make pull requests to the master branch only if you wish to contribute to README, LICENSE or similar meta data files. If you wish to make a contribution to any Stata file, then please do not use the master branch. If you wish to make a contribution to any Stata files that we have published at least once, then please fork from and make your pull request to the develop branch. The develop branch includes all minor edits we have made to already published commands since the last release that we will include in the next version released on the SSC server. If your addition is related to a specific issue in this repository, then see the naming convention in the CONTRIBUTING.md file.

All Stata commands that we are working on but have yet to release a first version of are found in branches called develop-NAME, where NAME corresponds to the working name of the command that is yet to be published. If you wish to contribute to any of those commands, then please fork from the branch of the command you want to contribute to, and only make edits to the .ado/.do and .sthlp files that correspond to that command. If you want to make contributions to multiple commands that have yet to be released, then you will have to fork from and make pull requests to multiple branches.

If you wish to make a contribution by making forks and pull requests but are not exactly sure how to do so, feel free to send an email to [email protected].

License

ietoolkit is developed under the MIT license. See http://adampritchard.mit-license.org/ or see the LICENSE file for details.

Main Contact

Luiza Cardoso de Andrade ([email protected])

Authors

Kristoffer Bjärkefur, Luiza Cardoso de Andrade, Benjamin Daniels, Mrijan Rimal

About us

DIME is the World Bank's impact evaluation department. Part of DIME’s mission is to intensify the production of and access to public goods that improve the quantity and quality of global development research, while lowering the costs of doing IE for the entire research community. This Library is developed and maintained by DIME Analytics. DIME Analytics supports quality research processes across the DIME portfolio, offers public trainings, and develops tools for the global community of development researchers.

Other DIME Analytics public goods are:
