  • Stars: 330
  • Rank 127,657 (Top 3 %)
  • Language: Jupyter Notebook
  • Created over 10 years ago
  • Updated about 4 years ago

Repository Details

Code for a tutorial on Bayesian Statistics by Allen Downey.

Bayesian Statistics Made Simple

Allen Downey

Bayesian statistical methods are becoming more common, but there are not many resources to help beginners get started. People who know Python can use their programming skills to get a head start.

In this tutorial, I introduce Bayesian methods using grid algorithms, which help develop understanding and prepare you for MCMC, a powerful algorithm for real-world problems.
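
As a rough illustration of what a grid algorithm looks like, here is a minimal sketch. It is not taken from the tutorial notebooks: the function name `grid_posterior` and the coin-flip data (7 heads, 3 tails) are made up for this example, and it uses only NumPy. The idea is to evaluate the prior and the likelihood at a fixed set of candidate values, multiply them, and normalize.

```python
import numpy as np

def grid_posterior(heads, tails, num_points=101):
    """Posterior over a coin's probability of heads, evaluated on a grid.

    Hypothetical helper for illustration only.
    """
    # Candidate values of the unknown probability, evenly spaced on [0, 1].
    grid = np.linspace(0, 1, num_points)
    # Uniform prior: every candidate value starts out equally plausible.
    prior = np.ones(num_points) / num_points
    # Binomial likelihood of the data at each grid point
    # (the constant binomial coefficient is omitted; it cancels on normalizing).
    likelihood = grid**heads * (1 - grid)**tails
    # Bayes's rule: multiply prior by likelihood, then normalize to sum to 1.
    unnormalized = prior * likelihood
    posterior = unnormalized / unnormalized.sum()
    return grid, posterior

grid, posterior = grid_posterior(heads=7, tails=3)
posterior_mean = np.sum(grid * posterior)
print("Most probable value:", grid[np.argmax(posterior)])
print("Posterior mean:", posterior_mean)
```

Because the posterior is just an array of probabilities, summaries such as the mean or the most probable value are simple array operations. A finer grid (a larger `num_points`) gives a closer approximation at the cost of more computation.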

It is based on my book, Think Bayes, a class I teach at Olin College, and my blog, “Probably Overthinking It.”

Slides for this tutorial are here.

Installation instructions

Note: Please try to install everything you need for this tutorial before you leave home!

To prepare for this tutorial, you have two options:

  1. Install Jupyter on your laptop and download my code from GitHub.

  2. Run the Jupyter notebooks on a virtual machine on Binder.

I'll provide instructions for both, but here's the catch: if everyone chooses Option 2, the wireless network might not be able to handle the load. So, I strongly encourage you to try Option 1 and only resort to Option 2 if you can't get Option 1 working.

Option 1A: If you already have Jupyter installed.

Code for this workshop is in a Git repository on GitHub.
You can download it in this zip file. When you unzip it, you should get a directory named BayesMadeSimple.

Or, if you have a Git client installed, you can clone the repo by running:

    git clone https://github.com/AllenDowney/BayesMadeSimple

It should create a directory named BayesMadeSimple.

To run the notebooks, you need Python 3 with Jupyter, NumPy, SciPy, matplotlib and Seaborn. If you are not sure whether you have those modules already, the easiest way to check is to run my code and see if it works.
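
If you would rather check for the modules directly, something like the following sketch may help. It is an assumption-laden example, not part of the repository: the helper name `find_missing` is hypothetical, and the list covers only the four scientific packages named above (Jupyter itself is easiest to check by trying to start it). It uses `importlib.util.find_spec` from the standard library, which looks a module up without importing it.

```python
import importlib.util

# Import names of the scientific packages listed above.
REQUIRED = ["numpy", "scipy", "matplotlib", "seaborn"]

def find_missing(modules):
    """Return the names in `modules` that Python cannot locate for import."""
    return [name for name in modules
            if importlib.util.find_spec(name) is None]

missing = find_missing(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

Run it with the same Python interpreter that Jupyter uses; a package installed for a different interpreter will still show up as missing here.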

You will also need a small library I wrote, called empyrical-dist. You can see it on PyPI and you can install it using pip:

    pip install empyrical-dist

To start Jupyter, run:

    cd BayesMadeSimple
    jupyter notebook

Jupyter should launch your default browser or open a tab in an existing browser window. If not, the Jupyter server should print a URL you can use. For example, when I launch Jupyter, I get

    ~/BayesMadeSimple$ jupyter notebook
    [I 10:03:20.115 NotebookApp] Serving notebooks from local directory: /home/downey/BayesMadeSimple
    [I 10:03:20.115 NotebookApp] 0 active kernels
    [I 10:03:20.115 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/
    [I 10:03:20.115 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

In this case, the URL is http://localhost:8888.
When you start your server, you might get a different URL. Whatever it is, if you paste it into a browser, you should see a home page with a list of the notebooks in the repository.

Click on 01_cookie.ipynb. It should open the first notebook for the tutorial.

Select the cell with the import statements and press "Shift-Enter" to run the code in the cell. If it works and you get no error messages, you are all set.

If you get error messages about missing packages, you can install the packages you need using your package manager, or try Option 1B and install Anaconda.

Option 1B: If you don't already have Jupyter.

I highly recommend installing Anaconda, which is a Python distribution that contains everything you need for this tutorial. It is easy to install on Windows, Mac, and Linux, and because it does a user-level install, it will not interfere with other Python installations.

Information about installing Anaconda is here.

Choose the Python 3.7 distribution.

After you install Anaconda, you can install the packages you need like this:

    conda install jupyter numpy scipy matplotlib seaborn
    pip install empyrical-dist

Or you can create a Conda environment just for the workshop, like this:

    cd BayesMadeSimple
    conda env create -f environment.yml
    conda activate BayesMadeSimple

Then go to Option 1A to make sure you can run my code.

Option 2: If Option 1 failed.

You can run my notebooks in a virtual machine on Binder. To launch the VM, press this button:

Binder

You should see a home page with a list of the files in the repository.

If you want to try the exercises, open 01_cookie.ipynb. You should be able to run the notebooks in your browser and try out the examples.

However, be aware that the virtual machine you are running is temporary.
If you leave it idle for more than an hour or so, it will disappear along with any work you have done.

Special thanks to the people who run Binder, which makes it easy to share and reproduce computation.

More Repositories

1

ThinkStats2

Text and supporting code for Think Stats, 2nd Edition
Jupyter Notebook
3,899
star
2

ThinkDSP

Think DSP: Digital Signal Processing in Python, by Allen B. Downey.
Jupyter Notebook
3,476
star
3

ThinkPython2

LaTeX source and supporting code for Think Python, 2nd edition, by Allen Downey.
TeX
2,378
star
4

ThinkBayes

Code repository for Think Bayes.
TeX
1,627
star
5

ThinkBayes2

Text and code for the forthcoming second edition of Think Bayes, by Allen Downey.
Jupyter Notebook
1,617
star
6

ThinkPython

Code examples and exercise solutions from Think Python by Allen Downey, published by O'Reilly Media.
PostScript
930
star
7

ModSimPy

Text and supporting code for Modeling and Simulation in Python
HTML
818
star
8

ThinkComplexity2

Book and code for Think Complexity, 2nd edition
Jupyter Notebook
728
star
9

ThinkOS

Text and supporting code for Think OS: A Brief Introduction to Operating Systems, by Allen Downey.
TeX
526
star
10

ThinkDataStructures

LaTeX source and supporting code for Think Data Structures: Algorithms and Information Retrieval in Java
TeX
510
star
11

ThinkJavaCode

Supporting code for Think Java by Allen Downey and Chris Mayfield.
Java
364
star
12

ElementsOfDataScience

An introduction to data science in Python, for people with no programming experience.
Jupyter Notebook
358
star
13

LittleBookOfSemaphores

LaTeX source and supporting code for The Little Book of Semaphores, by Allen Downey.
TeX
237
star
14

CompStats

Code for a workshop on statistical inference using computational methods in Python.
Jupyter Notebook
215
star
15

empiricaldist

Python library that represents empirical distribution functions.
Jupyter Notebook
152
star
16

DSIRP

Data Structures and Information Retrieval in Python
Jupyter Notebook
128
star
17

BiteSizeBayes

An introduction to Bayesian statistics using Python and (coming soon) R.
Jupyter Notebook
126
star
18

ThinkCPP

Text and code for Think C++ by Allen Downey
PostScript
111
star
19

ExercisesInC

Exercises for people learning the C programming language
C
103
star
20

ThinkComplexity

Code for Allen Downey's book Think Complexity, published by O'Reilly Media.
PostScript
96
star
21

AstronomicalData

An introduction to working with astronomical data in Python.
Jupyter Notebook
87
star
22

Swampy

Code for Swampy, a set of modules used in Think Python, first edition
Python
85
star
23

PhysicalModelingInMatlab

Text and code for Physical Modeling in MATLAB
TeX
83
star
24

ProbablyOverthinkingIt

Supplementary material for my book, Probably Overthinking It.
Jupyter Notebook
82
star
25

ThinkPythonItalian

LaTeX source for the Italian Translation of Think Python.
TeX
81
star
26

DataExploration

Supporting code for a video series on best practices for exploratory data analysis.
Python
71
star
27

BayesianDecisionAnalysis

Repository for a workshop on Bayesian Decision Analysis
Jupyter Notebook
64
star
28

ExploratoryDataAnalysis

Repository for an online class on Exploratory Data Analysis in Python
Jupyter Notebook
63
star
29

ThinkJava

LaTeX source for Think Java, 1st edition, by Allen Downey and Chris Mayfield.
TeX
57
star
30

SurvivalAnalysisPython

Explorations of survival analysis in Python
Jupyter Notebook
48
star
31

BayesForUndergrads

Materials for a workshop on developing undergraduate classes on Bayesian statistics.
46
star
32

DataScience

Site for a Data Science class taught by Allen Downey
HTML
42
star
33

ComplexityScience

Repository for a workshop on Complexity Science
Jupyter Notebook
35
star
34

ThinkX

Python
30
star
35

ThinkStats3

Code and LaTeX source for Think Stats, third edition
29
star
36

BayesSeminar

Bayesian statistics seminars
Jupyter Notebook
29
star
37

BayesianInferencePyMC

Workshop on Bayesian inference using PyMC
Jupyter Notebook
26
star
38

ElementsOfDataScienceBook

Repository for the manuscript of Elements of Data Science
TeX
25
star
39

PoliticalAlignmentCaseStudy

Notebooks and data for a case study on political alignment, outlook, and beliefs
Jupyter Notebook
23
star
40

thinkjavasolutions5

Automatically exported from code.google.com/p/thinkjavasolutions
Java
21
star
41

blair-walden-project

The Blair Walden Project: in 1845 Henry David Thoreau went to live in the woods... a year later his journal was found.
19
star
42

Portfolio

Portfolio of Allen Downey at Olin College
HTML
18
star
43

ThinkPythonSolutions

Automatically exported from code.google.com/p/thinkpythonsolutions
Python
17
star
44

DataQnA

Data Q&A: Questions and answers about data and statistics
Jupyter Notebook
17
star
45

ProbablyOverthinkingIt2

New repo for projects related to my blog, Probably Overthinking It.
Jupyter Notebook
16
star
46

MarriageNSFG

Repository for a project using NSFG data to explore marriage patterns in the US.
Stata
15
star
47

clink

A network measurement tool, described at http://allendowney.com/research/clink/
C
12
star
48

RecidivismCaseStudy

Case study on evaluating statistical tools that predict recidivism.
Jupyter Notebook
11
star
49

ModSim

Modeling and Simulation in Python and MATLAB/Octave
Jupyter Notebook
11
star
50

ThinkStats

Notebooks for the third edition of Think Stats
Jupyter Notebook
11
star
51

SignalsAndSystemsAndDynamics

Code and examples for an experimental class on signals, systems, and dynamics
MATLAB
10
star
52

GssReligion

Code and data for measuring and predicting religious affiliation using GSS data.
Jupyter Notebook
10
star
53

GunControlGenerational

Data and analysis related to generational changes in attitudes toward gun control
Jupyter Notebook
9
star
54

ThinkPerl6

Text and supporting code for Think Perl 6 by Laurent Rosenfeld with Allen Downey
TeX
9
star
55

ModSimMatlab

Text and supporting code for Modeling and Simulation.
Makefile
8
star
56

JavaOOP

Supporting code for the OOP in Java independent study
Java
8
star
57

DSIRPSolutions

Solutions to the exercises in Data Structures and Information Retrieval in Python (DSIRP)
Jupyter Notebook
8
star
58

SoftwareSystems

Repo for software related to Software Systems at Olin College.
C
8
star
59

ThinkBayes2Translations

Translations of Think Bayes.
Jupyter Notebook
8
star
60

plastex-oreilly

Branch of plastex that generates DocBook 4.5 that meets O'Reilly style guidelines.
TeX
7
star
61

JupyterAsciidocTemplate

Template for converting Jupyter notebooks to an asciidoc book.
Jupyter Notebook
7
star
62

internet-religion

Data and code for an analysis of Internet use and religious affiliation using data from the GSS.
Python
6
star
63

AtmoChem

Atmospheric chemistry data and analysis
Jupyter Notebook
6
star
64

TheShakes

Jupyter Notebook
5
star
65

complexity

Automatically exported from code.google.com/p/complexity
PostScript
5
star
66

PythonCounterPmf

Examples using Python's Counter collection to implement a probability mass function (PMF)
Jupyter Notebook
5
star
67

FirstLateNSFG

Data and analysis for "Are first babies more likely to be late?"
Jupyter Notebook
4
star
68

PythonFun

Jupyter Notebook
4
star
69

ThinkJavaSequel

Text and supporting code for Think DS: Data Structures in Java, by Allen Downey.
4
star
70

matlabsolutions

Automatically exported from code.google.com/p/matlabsolutions
MATLAB
4
star
71

ThinkOCaml

Automatically exported from code.google.com/p/thinkocaml
PostScript
4
star
72

Notebooks

A repo for IPython notebooks.
4
star
73

ISSPRegression

Exploration of the data from the Crowdsourced Replication Initiative
Makefile
4
star
74

thinkjava5

Automatically exported from code.google.com/p/thinkapjava
TeX
3
star
75

plastex-docbook

DocBook renderer plugin templates and classes for the plasTeX engine
Python
3
star
76

GssExtract

Jupyter Notebook
3
star
77

SoftwareDesign

Directories and unit tests for exercises in Software Design at Olin College.
Python
3
star
78

InspectionParadox

Code and data for an article on length-biased sampling and the inspection paradox
Jupyter Notebook
2
star
79

OlinPyShop

Code for Python workshops from Olin College
2
star
80

TeamAllocation

Code for making team allocations under constraints.
Python
2
star
81

QEACode

Code for Quantitative Engineering Analysis (QEA) class at Olin College
2
star
82

thinkpythonchinese

Automatically exported from code.google.com/p/thinkpythonchinese
TeX
2
star
83

simulating

2
star
84

LongTailedDistributions

Data and code from a series of papers about long-tailed distributions in the Internet.
2
star
85

AfroBarometer

Jupyter Notebook
1
star
86

python-in-hydrology

Automatically exported from code.google.com/p/python-in-hydrology
1
star
87

a-bad-synthesizer

Arduino-based analog-digital synthesizer
Python
1
star
88

2019-08-27-needham

Python
1
star
89

GssFeminism

Exploration of changes in views related to feminism
Jupyter Notebook
1
star