
A statistical framework that serves as a common interface to a large range of models


Development: Project Status: Active. The project has reached a stable, usable state and is being actively developed.

Zelig workflow overview

All models in Zelig can be estimated, and their results explored and presented, using four simple functions:

  1. zelig to estimate the parameters,

  2. setx to set fitted values for which we want to find quantities of interest,

  3. sim to simulate the quantities of interest,

  4. plot to plot the simulation results.

Zelig 5 reference classes

Zelig 5 introduced reference classes. These enable a different way of working with Zelig that is detailed in a separate vignette. Directly using the reference class architecture is optional, and it is not used in the examples below.

Zelig Quickstart Guide

Let’s walk through an example. This example uses the swiss dataset. It contains data on fertility and socioeconomic factors in Switzerland’s 47 French-speaking provinces in 1888 (Mosteller and Tukey, 1977, 549-551). We will model the effect of education on fertility, where education is measured as the percent of draftees with education beyond primary school and fertility is measured using the common standardized fertility measure (see Muehlenbein (2010, 80-81) for details).

Installing and Loading Zelig

If you haven't already done so, open your R console and install Zelig. We recommend installing Zelig with the zeligverse package. This installs core Zelig and ancillary packages at once.

install.packages('zeligverse')

Alternatively you can install the development version of Zelig with:

devtools::install_github('IQSS/Zelig')

Once Zelig is installed, load it:

library(zeligverse)

Building Models

Let’s assume we want to estimate the effect of education on fertility. Since fertility is a continuous variable, least squares (ls) is an appropriate model choice. To estimate our model, we call the zelig() function with three arguments: equation, model type, and data. (The additional cite = FALSE argument below only suppresses the citation message.)

# load data
data(swiss)

# estimate ls model
z5_1 <- zelig(Fertility ~ Education, model = "ls", data = swiss, cite = FALSE)

# model summary
summary(z5_1)

## Model: 
## 
## Call:
## z5$zelig(formula = Fertility ~ Education, data = swiss)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -17.036  -6.711  -1.011   9.526  19.689 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  79.6101     2.1041  37.836  < 2e-16
## Education    -0.8624     0.1448  -5.954 3.66e-07
## 
## Residual standard error: 9.446 on 45 degrees of freedom
## Multiple R-squared:  0.4406, Adjusted R-squared:  0.4282 
## F-statistic: 35.45 on 1 and 45 DF,  p-value: 3.659e-07
## 
## Next step: Use 'setx' method
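
As a cross-check, the same estimates can be reproduced without Zelig. The sketch below assumes that the ls model is ordinary least squares as fitted by base R's lm(); the object names fit and ten_point_change are illustrative and not part of Zelig.

```r
# Cross-check of the ls estimates with base R (assumes ls is ordinary least squares)
data(swiss)

fit <- lm(Fertility ~ Education, data = swiss)

# Intercept and slope; these should agree with the Zelig summary above
coef(fit)

# Implied change in the fertility measure when Education rises by 10 points
ten_point_change <- unname(coef(fit)["Education"]) * 10
ten_point_change
```

The last value is simply ten times the Education coefficient, which is the quantity the simulated first difference further down approximates.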

The -0.86 coefficient on education suggests a negative relationship between the education of a province and its fertility rate. More precisely, for every one percentage point increase in draftees educated beyond primary school, the province's fertility measure is predicted to decrease by 0.86 units. To help us better interpret this finding, we may want other quantities of interest, such as expected values or first differences. Zelig makes this simple by automating the translation of model estimates into interpretable quantities of interest using Monte Carlo simulation methods (see King, Tomz, and Wittenberg (2000) for more information). For example, let’s say we want to examine the effect of increasing the percent of draftees educated from 5 to 15. To do so, we set our predictor values using the setx() and setx1() functions:

# set education to 5 and 15
z5_1 <- setx(z5_1, Education = 5)
z5_1 <- setx1(z5_1, Education = 15)

# model summary
summary(z5_1)

## setx:
##   (Intercept) Education
## 1           1         5
## setx1:
##   (Intercept) Education
## 1           1        15
## 
## Next step: Use 'sim' method

After setting our predictor value, we simulate using the sim() method:

# run simulations and estimate quantities of interest
z5_1 <- sim(z5_1)

# model summary
summary(z5_1)

## 
##  sim x :
##  -----
## ev
##       mean       sd      50%     2.5%    97.5%
## 1 75.30616 1.658283 75.28057 72.12486 78.48007
## pv
##          mean       sd      50%     2.5%   97.5%
## [1,] 75.28028 9.707597 75.60282 57.11199 94.3199
## 
##  sim x1 :
##  -----
## ev
##       mean       sd      50%     2.5%    97.5%
## 1 66.66467 1.515977 66.63699 63.66668 69.64761
## pv
##          mean       sd      50%     2.5%    97.5%
## [1,] 66.02916 9.441273 66.32583 47.19223 82.98039
## fd
##        mean       sd       50%      2.5%     97.5%
## 1 -8.641488 1.442774 -8.656953 -11.43863 -5.898305

At this point, we’ve estimated a model, set the predictor values, and estimated easily interpretable quantities of interest. The summary() method shows us our quantities of interest, namely, our expected and predicted values at each level of education, as well as our first differences: the differences in expected values between the two set levels of education.
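
To make the simulation step less of a black box, here is a rough sketch of the general idea in base R plus MASS (which ships with R). It is not Zelig's internal code: it assumes the coefficients can be drawn from a multivariate normal centred on the lm() estimates, it omits predicted values (pv), and the seed, the number of draws, and the names n_sims, beta_draws, and summarise_draws are arbitrary choices for illustration.

```r
# Hand-rolled simulation of expected values and a first difference
# (an approximation of the idea, not Zelig's implementation)
library(MASS)  # provides mvrnorm()

data(swiss)
fit <- lm(Fertility ~ Education, data = swiss)

set.seed(1)      # arbitrary seed, only for reproducibility
n_sims <- 1000   # assumed number of simulation draws

# Draw coefficient vectors from their estimated sampling distribution
beta_draws <- mvrnorm(n_sims, mu = coef(fit), Sigma = vcov(fit))

# Expected values at Education = 5 and Education = 15
ev_x  <- beta_draws %*% c(1, 5)
ev_x1 <- beta_draws %*% c(1, 15)

# First difference: change in expected value when moving from 5 to 15
fd <- ev_x1 - ev_x

# Summarise each set of draws in the same layout as the Zelig output
summarise_draws <- function(x) {
  c(mean = mean(x), sd = sd(x), quantile(x, c(0.5, 0.025, 0.975)))
}
t(sapply(list(ev_x = ev_x, ev_x1 = ev_x1, fd = fd), summarise_draws))
```

Because the draws are random, the numbers will not match the Zelig output exactly, but the means should land close to the ev and fd rows shown above.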

Visualizations

Zelig’s plot() function plots the estimated quantities of interest:

plot(z5_1)

We can also simulate and plot quantities of interest across a range of fitted values:

z5_2 <- zelig(Fertility ~ Education, model = "ls", data = swiss, cite = FALSE)

# set Education to range from 5 to 15 at single integer increments
z5_2 <- setx(z5_2, Education = 5:15)

# run simulations and estimate quantities of interest
z5_2 <- sim(z5_2)

Then use the plot() function as before:

plot(z5_2)
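
For comparison, a similar picture can be sketched analytically in base R. The example below assumes the same least squares model and uses predict() with interval = "confidence" to obtain a 95% band for the expected value at each Education level. It is an analytic counterpart to the simulated plot, not a reproduction of Zelig's plotting code, and the names edu_grid and band are illustrative.

```r
# Analytic 95% confidence band for expected fertility over Education = 5..15
data(swiss)
fit <- lm(Fertility ~ Education, data = swiss)

# Grid of Education values, mirroring setx(z5_2, Education = 5:15)
edu_grid <- data.frame(Education = 5:15)

# Columns: fit (expected value), lwr and upr (confidence limits)
band <- cbind(
  edu_grid,
  predict(fit, newdata = edu_grid, interval = "confidence", level = 0.95)
)

# Plot the expected value with dashed confidence limits
plot(band$Education, band$fit, type = "l",
     ylim = range(band$lwr, band$upr),
     xlab = "Education", ylab = "Expected Fertility")
lines(band$Education, band$lwr, lty = 2)
lines(band$Education, band$upr, lty = 2)
```

The simulated intervals from sim() and this analytic band should be close for a linear model, which makes the comparison a quick sanity check.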

Getting help

The primary documentation for Zelig is available at: http://docs.zeligproject.org/articles/.

Within R, you can access function help using the normal ? function, e.g.:

?setx

If you are looking for details on particular estimation model methods, you can also use the ? function. Simply place a z before the model name. For example, to access details about the logit model use:

?zlogit

Building Zelig (for developers)

Zelig can be fully checked and built using the code in check_build_zelig.R. Note that this can be time-consuming due to the extensive test coverage.
