  • Stars: 630
  • Rank: 71,328 (Top 2 %)
  • Language: Jupyter Notebook
  • License: MIT License
  • Created almost 7 years ago
  • Updated over 1 year ago

Repository Details

How to do Bayesian statistical modelling using numpy and PyMC3

If you're taking this tutorial at SciPy 2022, please pull the repository 9am CT the day of the tutorial to make sure that you have the most recent version!

bayesian-stats-modelling-tutorial

Binder

for conference tutorial attendees

If you're looking for the material for a specific conference tutorial, navigate to the notebooks directory and look for the subdirectory for the conference you're interested in. For example, notebooks/ODSC-East-2020-04-14 contains the material for Hugo's ODSC East tutorial on April 14, 2020.

getting started

To get started, first identify whether you:

  • Would like to run the tutorial material on servers hosted elsewhere, to avoid installation,
  • Prefer to use the conda package manager (which ships with the Anaconda distribution of Python),
  • Prefer to use pipenv, which is a package authored by Kenneth Reitz for package management with pip and virtualenv, or
  • Only want to view the website version of the notebooks.

To run the tutorial material on servers elsewhere

Binder

To do this, click on the Binder badge above. This will spin up the necessary computational environment so that you can write and execute Python code from the comfort of your browser. Binder is a free service, so its resources are not guaranteed, though it usually works well. If you want as close to a guarantee as possible, follow the instructions below to set up your computational environment locally (that is, on your own computer).

1. Clone the repository locally

In your terminal, use git to clone the repository locally.

git clone https://github.com/ericmjl/bayesian-stats-modelling-tutorial

Alternatively, you can download the zip file of the repository at the top of the main page of the repository. If you prefer not to use git or don't have experience with it, this is a good option.
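Whichever route you take, it can help to confirm that the copy on disk is complete before going further. The following is a minimal sketch, not part of the repository: `missing_paths` is a hypothetical helper, and the two paths it checks are simply the ones this README refers to (binder/environment.yml and the notebooks directory).

```python
from pathlib import Path

# Paths this README refers to; adjust if the repository layout differs.
EXPECTED = ["binder/environment.yml", "notebooks"]


def missing_paths(repo_root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the directory that contains the clone.
    missing = missing_paths("bayesian-stats-modelling-tutorial")
    print("Missing:", missing if missing else "nothing")
```

An empty list means both paths were found.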

2. Download Anaconda (if you haven't already)

If you do not already have the Anaconda distribution of Python 3, go get it. (Note: you can also set up your project environment without Anaconda by using pip to install the required packages; however, Anaconda is great for data science and we encourage you to use it.)

3. Set up your environment

3a. conda users

If this is the first time you're setting up your compute environment, use the conda package manager to install all the necessary packages from the provided environment.yml file.

conda env create -f binder/environment.yml

To activate the environment, use the conda activate command.

conda activate bayesian-modelling-tutorial

If you get an error activating the environment, use the older source activate command.

source activate bayesian-modelling-tutorial

To update the environment based on the environment.yml specification file, use the conda env update command.

conda env update -f binder/environment.yml
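After activating the environment, a quick way to check that it is usable is to ask Python whether the key packages can be found. This is a small sketch under assumptions: the `REQUIRED` list only names numpy and PyMC3 (imported as `pymc3`) because the tutorial description mentions them; the authoritative list is binder/environment.yml. `missing_packages` is a hypothetical helper name.

```python
import importlib.util

# Assumed list for illustration; see binder/environment.yml for the real one.
REQUIRED = ["numpy", "pymc3"]


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages can be found.")
```

If anything is reported missing, re-running the conda env update command above is a reasonable first step.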

3b. pip users

Please install all of the packages listed in the environment.yml file manually. An example command would be:

pip install networkx scipy ...
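To avoid copying names by hand, the package list can be pulled out of an environment file with a few lines of Python. The sketch below is only illustrative: `conda_dependencies` is a hypothetical helper, `EXAMPLE_YML` is made-up content rather than the repository's actual file, and the parser assumes the simple layout shown (a top-level `dependencies:` key followed by `- name` items). Note also that conda package names do not always match the names used on PyPI, so review the result before installing.

```python
import re

# Made-up example content; NOT the repository's real environment.yml.
EXAMPLE_YML = """\
name: example-env
channels:
  - conda-forge
dependencies:
  - python=3.8
  - numpy
  - scipy>=1.4
  - pip:
    - some-pip-only-package
"""

# Entries that are not installed through "pip install".
SKIP = {"python", "pip"}


def conda_dependencies(yaml_text):
    """Collect first-level items under `dependencies:`, without version pins."""
    names = []
    in_deps = False
    item_indent = None
    for raw in yaml_text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop comments and trailing spaces
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        body = line.strip()
        if not body.startswith("- "):
            # An unindented key opens or closes the dependencies section.
            if indent == 0:
                in_deps = body == "dependencies:"
                item_indent = None
            continue
        if not in_deps:
            continue
        if item_indent is None:
            item_indent = indent
        if indent != item_indent:
            continue  # nested list, e.g. packages under "- pip:"
        entry = body[2:].strip()
        if entry.endswith(":"):
            continue  # "- pip:" opens a nested list; it is not a package
        name = re.split(r"[=<>!~ ]", entry, maxsplit=1)[0]
        if name not in SKIP:
            names.append(name)
    return names


if __name__ == "__main__":
    print("pip install " + " ".join(conda_dependencies(EXAMPLE_YML)))
```

To use it on the real file, read binder/environment.yml into a string and pass that string to the function instead of `EXAMPLE_YML`.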

3c. don't want to mess with dev-ops

If you don't want to mess around with dev-ops, click the following badge to get a Binder session on which you can compute and write code.

Binder

4a. Open your Jupyter notebook

  1. You will have to install a new IPython kernelspec if you created a new conda environment with binder/environment.yml.

    python -m ipykernel install --user --name bayesian-modelling-tutorial --display-name "Python (bayesian-modelling-tutorial)"

You can change the --display-name to anything you want, though if you leave it out, the kernel's display name will default to the value passed to the --name flag.

  2. In the terminal, execute jupyter notebook.

Navigate to the notebooks directory and open the notebook 01-Student-Probability_a_simulated_introduction.ipynb.
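Once the notebook is open, you may want to confirm that it is running on the kernel registered in step 1 rather than on your default Python. The following cell is a rough sketch: it relies on the assumption that conda keeps the environment in a directory named after it (the default, e.g. .../envs/bayesian-modelling-tutorial), and `running_in_env` is a hypothetical helper name.

```python
import os
import sys


def running_in_env(env_name, prefix=sys.prefix):
    """True when the last component of the interpreter prefix equals env_name."""
    return os.path.basename(os.path.normpath(prefix)) == env_name


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("In tutorial environment:", running_in_env("bayesian-modelling-tutorial"))
```

If the check prints False, select the "Python (bayesian-modelling-tutorial)" kernel from the notebook's kernel menu and run the cell again.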

4b. Open your Jupyter notebook in Jupyter Lab!

In the terminal, execute jupyter lab.

Navigate to the notebooks directory and open the notebook 01-Student-Probability_a_simulated_introduction.ipynb.

Now, if you're using Jupyter Lab, you'll need to get ipywidgets working for Notebook 2. The documentation is here.

In short, you'll need Node.js installed, and you'll need to run the following in your terminal:

jupyter labextension install @jupyter-widgets/jupyterlab-manager

4c. Open your Jupyter notebook using Binder.

Launch Binder using the button at the top of this README.md. Voila!

4d. Want to view static HTML notebooks

If you're interested in only viewing the static HTML versions of the notebooks, the links are provided below:

Part 1: Bayesian Data Science by Simulation

Part 2: Bayesian Data Science by Probabilistic Programming

Acknowledgements

Development of this type of material is almost always a result of years of discussions between members of a community. We'd like to thank the community and to mention several people who have played pivotal roles in our understanding of the material: Michael Betancourt, Justin Bois, Allen Downey, Chris Fonnesbeck, Jake VanderPlas. Also, Andrew Gelman rocks!

Feedback

Please leave feedback for us here! We'll use this information to help improve the teaching and delivery of the material.

data credits

Please see individual notebooks for dataset attribution.

Further Reading & Resources

Further reading resources that are not specifically tied to any notebooks.
