Repository Details

THIS IS THE **OLD** PYMC PROJECT (VERSION 2). PLEASE USE PYMC INSTEAD:

Introduction

Version: 2.3.8
Authors: Chris Fonnesbeck, Anand Patil, David Huard, John Salvatier
Web site: https://github.com/pymc-devs/pymc
Documentation: http://bit.ly/pymc_docs
Copyright: This document has been placed in the public domain.
License: PyMC is released under the Academic Free License.

NOTE: The current version of PyMC (version 3) has been moved to its own repository, called pymc3. Unless you have a good reason for using this package, we recommend that all new users adopt PyMC3.

Purpose

PyMC is a Python module that implements Bayesian statistical models and fitting algorithms, including Markov chain Monte Carlo. Its flexibility and extensibility make it applicable to a large suite of problems. Along with core sampling functionality, PyMC includes methods for summarizing output, plotting, goodness-of-fit checks and convergence diagnostics.

Features

PyMC provides functionalities to make Bayesian analysis as painless as possible. Here is a short list of some of its features:

  • Fits Bayesian statistical models with Markov chain Monte Carlo and other algorithms.
  • Includes a large suite of well-documented statistical distributions.
  • Uses NumPy for numerics wherever possible.
  • Includes a module for modeling Gaussian processes.
  • Sampling loops can be paused and tuned manually, or saved and restarted later.
  • Creates summaries including tables and plots.
  • Traces can be saved to disk as plain text, Python pickles, SQLite or MySQL databases, or HDF5 archives.
  • Several convergence diagnostics are available.
  • Extensible: easily incorporates custom step methods and unusual probability distributions.
  • MCMC loops can be embedded in larger programs, and results can be analyzed with the full power of Python.
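
To make the first feature concrete, here is a minimal sketch of random-walk Metropolis sampling written in plain Python. It is not PyMC's implementation and does not use the PyMC API: the toy model (a normal mean with known noise), the function names, and every parameter value are assumptions chosen for illustration only.

```python
# Illustrative sketch only: a generic random-walk Metropolis sampler.
# Model, names, and tuning values are hypothetical, not taken from PyMC.
import math
import random


def log_posterior(mu, data, prior_sd=10.0, noise_sd=1.0):
    """Unnormalized log-posterior for a normal mean with a normal prior."""
    log_prior = -0.5 * (mu / prior_sd) ** 2
    log_like = sum(-0.5 * ((y - mu) / noise_sd) ** 2 for y in data)
    return log_prior + log_like


def metropolis(data, n_iter=20000, burn=2000, step_sd=0.5, seed=0):
    """Return post-burn-in draws of mu and the overall acceptance rate."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_posterior(mu, data)
    draws = []
    accepted = 0
    for i in range(n_iter):
        # Propose a symmetric random-walk move around the current value.
        proposal = mu + rng.gauss(0.0, step_sd)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
            accepted += 1
        if i >= burn:
            draws.append(mu)
    return draws, accepted / n_iter


data = [1.2, 0.8, 1.1, 0.9, 1.0]
draws, rate = metropolis(data)
posterior_mean = sum(draws) / len(draws)
print(round(posterior_mean, 2), round(rate, 2))
```

For this conjugate toy model the exact posterior mean is known in closed form (sum of the data divided by the posterior precision, here 5.0 / 5.01), which makes it easy to check that the sampler behaves sensibly.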

What's new in version 2

This second version of PyMC benefits from a major rewrite effort. It substantially improves code extensibility, the user interface, and raw performance. Most notably, the PyMC 2 series provides:

  • New flexible object model and syntax (not backward-compatible).
  • Reduced redundant computations: only relevant log-probability terms are computed, and these are cached.
  • Optimized probability distributions.
  • New adaptive blocked Metropolis step method.
  • Much more!
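
The idea of caching log-probability terms can be sketched in a few lines of plain Python. This is only an illustration of the general technique, under the assumption that a term needs recomputing only when its inputs change; CachedLogp and normal_logp are hypothetical names, not PyMC classes or functions, and PyMC's actual caching mechanism may differ.

```python
# Illustrative sketch only: a toy wrapper that caches a log-probability.
# CachedLogp and its attributes are hypothetical, not part of PyMC.
import math


class CachedLogp:
    """Recompute a log-probability only when its inputs change."""

    def __init__(self, logp_fn):
        self._logp_fn = logp_fn
        self._cache_key = None
        self._cache_value = None
        self.evaluations = 0  # counts real evaluations, for demonstration

    def __call__(self, *inputs):
        if inputs != self._cache_key:
            self._cache_value = self._logp_fn(*inputs)
            self._cache_key = inputs
            self.evaluations += 1
        return self._cache_value


def normal_logp(x, mu, tau):
    """Log-density of a normal distribution parameterized by precision tau."""
    return 0.5 * math.log(tau / (2.0 * math.pi)) - 0.5 * tau * (x - mu) ** 2


logp = CachedLogp(normal_logp)
first = logp(0.3, 0.0, 0.01)
second = logp(0.3, 0.0, 0.01)  # same inputs: served from the cache
third = logp(0.5, 0.0, 0.01)   # changed input: recomputed
print(logp.evaluations)
```

In a sampler, a rejected proposal restores the previous values, so a cache like this avoids re-evaluating terms whose inputs are back to what they were.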

Usage

First, define your model in a file, say mymodel.py (with comments, of course!):

# Import relevant modules
import pymc
import numpy as np

# Some data
n = 5 * np.ones(4, dtype=int)
x = np.array([-.86, -.3, -.05, .73])

# Priors on unknown parameters
alpha = pymc.Normal('alpha', mu=0, tau=.01)
beta = pymc.Normal('beta', mu=0, tau=.01)

# Arbitrary deterministic function of parameters
@pymc.deterministic
def theta(a=alpha, b=beta):
    """theta = logit^{-1}(a + b*x)"""
    return pymc.invlogit(a + b * x)

# Binomial likelihood for data
d = pymc.Binomial('d', n=n, p=theta, value=np.array([0., 1., 3., 5.]),
                  observed=True)

Save this file, then from a Python shell (or another file in the same directory), call:

import pymc
import mymodel

S = pymc.MCMC(mymodel, db='pickle')
S.sample(iter=10000, burn=5000, thin=2)
pymc.Matplot.plot(S)

This example runs the sampler for 10000 iterations, discards the first half as burn-in, and thins the remainder by a factor of 2, which leaves about 2500 retained posterior samples. The sample is stored in a Python serialization (pickle) database.
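
The effect of the iter, burn and thin arguments on the size of the stored trace can be worked out with a small helper. This is a sketch under the assumption that every thin-th iteration after the burn-in period is stored; retained_draws is a hypothetical helper, not a PyMC function, and the exact indexing convention inside PyMC is not confirmed here.

```python
# Illustrative sketch only: retained_draws is a hypothetical helper.
def retained_draws(n_iter, burn, thin):
    """Number of draws kept if every `thin`-th iteration after `burn` is stored."""
    return len(range(burn, n_iter, thin))


kept = retained_draws(n_iter=10000, burn=5000, thin=2)
print(kept)  # 2500
```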

History

PyMC began development in 2003, as an effort to generalize the process of building Metropolis-Hastings samplers, with the aim of making Markov chain Monte Carlo (MCMC) more accessible to non-statisticians (particularly ecologists). The choice to develop PyMC as a Python module, rather than a standalone application, allowed the use of MCMC methods in a larger modeling framework. By 2005, PyMC was reliable enough for version 1.0 to be released to the public. A small group of regular users, most associated with the University of Georgia, provided much of the feedback necessary for the refinement of PyMC to a usable state.

In 2006, David Huard and Anand Patil joined Chris Fonnesbeck on the development team for PyMC 2.0. This iteration of the software strives for more flexibility, better performance and a better end-user experience than any previous version of PyMC.

PyMC 2.1 was released in early 2010. It contains numerous bugfixes and optimizations, as well as a few new features. This user guide is written for version 2.1.

Relationship to other packages

PyMC is one of many general-purpose MCMC packages. The most prominent among them is WinBUGS, which has made MCMC, and with it Bayesian statistics, accessible to a huge user community. Unlike PyMC, WinBUGS is a stand-alone, self-contained application. This can be an attractive feature for users without much programming experience, but others may find it constraining. A related package is JAGS, which provides a more UNIX-like implementation of the BUGS language. Other packages include Hierarchical Bayes Compiler and a number of R packages of varying scope.

It would be difficult to meaningfully benchmark PyMC against these other packages because of the unlimited variety in Bayesian probability models and flavors of the MCMC algorithm. However, it is possible to anticipate how it will perform in broad terms.

PyMC's number-crunching is done using a combination of industry-standard libraries (NumPy and the linear algebra libraries on which it depends) and hand-optimized Fortran routines. For models that are composed of variables valued as large arrays, PyMC will spend most of its time in these fast routines. In that case, it will be roughly as fast as packages written entirely in C and faster than WinBUGS. For finer-grained models containing mostly scalar variables, it will spend most of its time in coordinating Python code. In that case, despite our best efforts at optimization, PyMC will be significantly slower than packages written in C and on par with or slower than WinBUGS. However, as fine-grained models are often small and simple, the total time required for sampling is often quite reasonable despite this poorer performance.
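
The difference between array-valued and scalar-valued computation can be illustrated with the binomial model from the Usage section. The sketch below evaluates the same log-likelihood (omitting the constant binomial coefficient) once with whole-array NumPy operations and once with an element-by-element Python loop. It is not PyMC code: the function names and the test values of alpha and beta are assumptions made for illustration, and no timing claim is made here.

```python
# Illustrative sketch only: the same log-likelihood computed two ways.
# Function names are hypothetical; data mirror the Usage example.
import math

import numpy as np

n = 5 * np.ones(4, dtype=int)
x = np.array([-.86, -.3, -.05, .73])
d_obs = np.array([0, 1, 3, 5])


def loglike_vectorized(alpha, beta):
    """Binomial log-likelihood (up to a constant) using array operations."""
    p = 1.0 / (1.0 + np.exp(-(alpha + beta * x)))
    return float(np.sum(d_obs * np.log(p) + (n - d_obs) * np.log1p(-p)))


def loglike_loop(alpha, beta):
    """The same quantity computed one scalar observation at a time."""
    total = 0.0
    for n_i, x_i, d_i in zip(n, x, d_obs):
        p_i = 1.0 / (1.0 + math.exp(-(alpha + beta * x_i)))
        total += d_i * math.log(p_i) + (n_i - d_i) * math.log(1.0 - p_i)
    return float(total)


print(loglike_vectorized(0.5, 2.0), loglike_loop(0.5, 2.0))
```

Both functions return the same value; the vectorized form spends its time inside compiled array routines, while the loop spends it in interpreted Python, which is the trade-off the paragraph above describes.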

We have chosen to spend time developing PyMC rather than using an existing package primarily because it allows us to build and efficiently fit any model we like within a full-fledged Python environment. We have emphasized extensibility throughout PyMC's design, so if it doesn't meet your needs out of the box, chances are you can make it do so with a relatively small amount of code. See the testimonials page on the wiki for reasons why other users have chosen PyMC.

Getting started

This guide provides all the information needed to install PyMC, code a Bayesian statistical model, run the sampler, and save and visualize the results. In addition, it contains a list of the statistical distributions currently available. More usage examples and tutorials are available from the PyMC web site.
