  • Stars 181
  • Rank 212,110 (Top 5 %)
  • Language Stan
  • Created over 5 years ago
  • Updated 3 months ago

Repository Details

Database with posteriors of interest for Bayesian inference

posteriordb: a database of Bayesian posterior inference

What is posteriordb?

posteriordb is a set of posteriors, i.e. Bayesian statistical models and data sets, reference implementations in probabilistic programming languages, and reference posterior inferences in the form of posterior samples.

Why use posteriordb?

posteriordb is designed to test inference algorithms across a wide range of models and data sets. Applications include testing for accuracy, speed, and scalability. posteriordb can be used to test new algorithms under development, or it can be deployed as part of continuous integration for ongoing regression testing of algorithms in probabilistic programming frameworks.
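As an illustration of the accuracy use case, the following is a minimal sketch of comparing draws from a candidate algorithm against reference posterior draws for a single parameter. It does not use the posteriordb API: the function names, the threshold of 4 standard errors, and the simulated draws are all assumptions made for this example, and the draws are treated as independent, which real MCMC output usually is not.

```python
import math
import random
import statistics


def mean_z_score(candidate, reference):
    """Z-score of the difference in means, treating both sets as independent draws."""
    standard_error = math.sqrt(
        statistics.variance(candidate) / len(candidate)
        + statistics.variance(reference) / len(reference)
    )
    return (statistics.fmean(candidate) - statistics.fmean(reference)) / standard_error


def passes_accuracy_check(candidate, reference, threshold=4.0):
    """Hypothetical pass/fail rule: means must agree within `threshold` standard errors."""
    return abs(mean_z_score(candidate, reference)) < threshold


# Simulated stand-ins: in practice `reference` would hold reference posterior
# draws for one parameter, and the other lists would come from the algorithm under test.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
unbiased = [rng.gauss(0.0, 1.0) for _ in range(1_000)]
biased = [rng.gauss(0.5, 1.0) for _ in range(1_000)]

print(passes_accuracy_check(unbiased, reference))
print(passes_accuracy_check(biased, reference))
```

A check of this kind could run in continuous integration for each posterior of interest. A more careful version would account for autocorrelation, for example by using an effective sample size in place of the raw number of draws.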

posteriordb also makes it easy for students and instructors to access various pedagogical and real-world examples with precise model definitions, well-curated data sets, and reference posteriors.

posteriordb is framework agnostic and easily accessible from R and Python.

For more details regarding the use cases of posteriordb, see doc/use_cases.md.

Content

See DATABASE_CONTENT.md for the detailed content of the posterior database.

Contributing

We are happy with any help in adding posteriors, data, and models to the database! See CONTRIBUTING.md for the details on how to contribute.

Using posteriordb

To simplify the use of posteriordb, there are convenience functions both in R and in Python.

Citing posteriordb

Developing and maintaining open-source software is an important yet often underappreciated contribution to scientific progress. Thus, please make sure to cite it appropriately so that developers get credit for their work. Information on how to cite posteriordb can be found in the CITATION.cff file. Use the “cite this repository” button under “About” to get a simple BibTeX or APA snippet.

As posteriordb relies heavily on Stan, please also consider citing Stan:

Carpenter, B., Gelman, A., Hoffman, M. D., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M., Guo, J., Li, P., and Riddell, A. (2017). Stan: A probabilistic programming language. Journal of Statistical Software, 76(1). doi:10.18637/jss.v076.i01

Design choices (so far)

The main focus of the database is simplicity, both in understanding and in use.

The following are the current design choices for the posterior database.

  1. Priors are hardcoded in model files as changing the prior changes the posterior. Create a new model to test different priors.
  2. Data transformations are stored as different datasets. Create new data to test different data transformations, subsets, and variable settings. This design choice makes the database larger/less memory efficient but simplifies the analysis of individual posteriors.
  3. Models and data have (model/data).info.json files with model- and data-specific information.
  4. Templates for different JSONs can be found in content/templates and schemas in schemas. (Note: these don’t exist right now and will be added later.)
  5. The prefix ‘syn_’ stands for synthetic data, where the generative process is known and can be found in content/data-raw.
  6. All data preprocessing is included in content/data-raw.
  7. Specific information for different PPL representations of models is included in the PPL syntax files as comments, not in the model.info.json files.
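The naming conventions in points 3 and 5 can be sketched in a few lines of Python. This is an illustration only and not the posteriordb API: the helper names, the one-file-per-name layout `<name>.info.json`, and the JSON fields written in the demo are assumptions, since the actual info file schema is not described here.

```python
import json
import tempfile
from pathlib import Path


def is_synthetic(data_name: str) -> bool:
    """Point 5: a name starting with 'syn_' denotes synthetic data."""
    return data_name.startswith("syn_")


def info_path(root: Path, name: str) -> Path:
    """Assumed layout: one '<name>.info.json' file per model or data set."""
    return root / f"{name}.info.json"


def load_info(root: Path, name: str) -> dict:
    """Read the info file for a model or data set as a dictionary."""
    with info_path(root, name).open(encoding="utf-8") as handle:
        return json.load(handle)


# Demo in a throwaway directory with made-up contents (hypothetical fields).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    info_path(root, "syn_example").write_text(
        json.dumps({"name": "syn_example"}), encoding="utf-8"
    )
    info = load_info(root, "syn_example")

print(info["name"], is_synthetic(info["name"]))
```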

Versioning of models

We might update models included in posteriordb over time. However, the models will only have the same name in posteriordb if the log density is the same (up to a normalizing constant). Otherwise, we will include a new model in the database.
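To make the "same log density up to a normalizing constant" rule concrete, here is a small sketch with a hypothetical one-parameter model. Two implementations count as the same model under this rule if their log densities differ by the same amount at every parameter value. The data, priors, and function names are invented for the example, and a check on a finite grid can only refute equivalence, not prove it.

```python
import math

DATA = [1.2, -0.4, 0.7]  # hypothetical observations


def log_density_full(mu):
    """Normal(mu, 1) likelihood with a Normal(0, 10) prior on mu, all constants kept."""
    likelihood = sum(-0.5 * math.log(2 * math.pi) - 0.5 * (y - mu) ** 2 for y in DATA)
    prior = -0.5 * math.log(2 * math.pi) - math.log(10.0) - 0.5 * (mu / 10.0) ** 2
    return likelihood + prior


def log_density_dropped_constants(mu):
    """Same model with every term that does not depend on mu dropped."""
    return sum(-0.5 * (y - mu) ** 2 for y in DATA) - 0.5 * (mu / 10.0) ** 2


def log_density_other_prior(mu):
    """Same likelihood with a Normal(0, 1) prior, which gives a different posterior."""
    return sum(-0.5 * (y - mu) ** 2 for y in DATA) - 0.5 * mu ** 2


def differ_by_constant(f, g, points, tol=1e-9):
    """True if f - g takes the same value (within tol) at every test point."""
    diffs = [f(p) - g(p) for p in points]
    return max(diffs) - min(diffs) < tol


grid = [-3.0, -1.0, 0.0, 0.5, 2.0]
print(differ_by_constant(log_density_full, log_density_dropped_constants, grid))
print(differ_by_constant(log_density_full, log_density_other_prior, grid))
```

Under the versioning rule, the first pair could share a model name, while the changed prior in the second pair would call for a new model entry.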
