  • Stars: 42,364
  • Rank 326 (Top 0.01 %)
  • Language: Jupyter Notebook
  • License: MIT License
  • Created over 8 years ago
  • Updated 6 months ago

Repository Details

Python Data Science Handbook: full text in Jupyter Notebooks

Python Data Science Handbook

This repository contains the entire Python Data Science Handbook, in the form of (free!) Jupyter notebooks.

How to Use this Book

About

The book was written and tested with Python 3.5, though other Python versions (including Python 2.7) should work in nearly all cases.

The book introduces the core libraries essential for working with data in Python: particularly IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and related packages. Familiarity with Python as a language is assumed; if you need a quick introduction to the language itself, see the free companion project, A Whirlwind Tour of Python: it's a fast-paced introduction to the Python language aimed at researchers and scientists.

See Index.ipynb for an index of the notebooks available to accompany the text.

Software

The code in the book was tested with Python 3.5, though most (but not all) of it will also work correctly with Python 2.7 and other older Python versions.

The packages I used to run the code in the book are listed in requirements.txt. Note that some of these exact version numbers may not be available on your platform; you may have to tweak them for your own use. To install the requirements using conda, run the following at the command line:

$ conda install --file requirements.txt

To create a stand-alone environment named PDSH with Python 3.5 and all the required package versions, run the following:

$ conda create -n PDSH python=3.5 --file requirements.txt

You can read more about using conda environments in the Managing Environments section of the conda documentation.
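
Because the pinned versions may need tweaking, it can help to compare what requirements.txt asks for with what is actually installed. The following is a minimal sketch, not part of the repository: the helper names `parse_requirements` and `check_installed` are hypothetical, the parser assumes simple `name`, `name=version`, or `name==version` lines, and `importlib.metadata` requires Python 3.8 or newer (so it will not run on the Python 3.5 environment described above).

```python
import re
from importlib import metadata  # standard library from Python 3.8 onward


def parse_requirements(text):
    """Map each package name to its pinned version (or None if unpinned).

    Assumes simple lines such as "numpy==1.11.1", "numpy=1.11.1", or "jupyter".
    Other specifiers (for example ">=") are treated as unpinned.
    """
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = re.match(r"^([A-Za-z0-9_.\-]+)\s*(?:==?\s*([^\s=]+))?", line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def check_installed(pins):
    """Return {name: (wanted_version, installed_version_or_None)}."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        report[name] = (wanted, found)
    return report
```

To use it, read the file with `open("requirements.txt").read()`, pass the text to `parse_requirements`, and pass the result to `check_installed`. Any entry whose two values differ is a candidate for adjusting the pin or reinstalling the package. Note that conda package names do not always match the names Python package metadata reports, so a `None` result is a hint rather than proof that a package is missing.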

License

Code

The code in this repository, including all code samples in the notebooks listed above, is released under the MIT license. Read more at the Open Source Initiative.

Text

The text content of the book is released under the CC-BY-NC-ND license. Read more at Creative Commons.

More Repositories

1

WhirlwindTourOfPython

The Jupyter Notebooks behind my O'Reilly report, "A Whirlwind Tour of Python"
Jupyter Notebook
3,671
star
2

sklearn_tutorial

Materials for my scikit-learn tutorial
Jupyter Notebook
1,720
star
3

sklearn_pycon2015

Materials for my PyCon 2015 scikit-learn tutorial.
Jupyter Notebook
884
star
4

sklearn_scipy2013

Scikit-learn tutorials for the SciPy 2013 conference
Python
325
star
5

JSAnimation

[DEPRECATED] An IPython notebook-compatible Javascript/HTML viewer for matplotlib animations
Jupyter Notebook
240
star
6

sklearn_pycon2014

Repository containing files for my PyCon 2014 scikit-learn tutorial.
Jupyter Notebook
226
star
7

nfft

Lightweight non-uniform Fast Fourier Transform in Python
Jupyter Notebook
196
star
8

sklearn_pycon2013

Files for my scikit-learn tutorial at PyCon 2013
Jupyter Notebook
173
star
9

wpca

Weighted Principal Component Analysis (PCA) in Python
Jupyter Notebook
141
star
10

2013_fall_ASTR599

Content for my Astronomy 599 Course: Intro to scientific computing in Python
Jupyter Notebook
138
star
11

sklearn_pydata2015

Scikit-Learn Tutorial for PyData Seattle 2015
Python
136
star
12

pySchrodinger

A Python solver for the 1D Schrodinger equation
Python
120
star
13

JupyterWorkflow

Reproducible Data Analysis Workflow in Jupyter
Jupyter Notebook
116
star
14

BayesianAstronomy

Bayesian Methods in Astronomy workshop, presented at AAS227
Jupyter Notebook
111
star
15

ipywidgets-static

[obsolete] Static Widgets for IPython Notebooks
Jupyter Notebook
107
star
16

altair-examples

Some examples of Altair plots
Jupyter Notebook
92
star
17

lpproj

Scikit-learn compatible Locality Preserving Projections in Python
Jupyter Notebook
89
star
18

PythonicPerambulations

Old source for jakevdp.github.io. New source at http://github.com/jakevdp/jakevdp.github.io-source
Jupyter Notebook
86
star
19

mst_clustering

Scikit-learn style estimator for Minimum Spanning Tree Clustering in Python
Jupyter Notebook
84
star
20

supersmoother

Efficient pure Python implementation of Friedman's Supersmoother
Python
83
star
21

jakevdp.github.io-source

Source for my Pythonic Perambulations blog
Jupyter Notebook
82
star
22

2014_fall_ASTR599

Content for my Astronomy 599 / Applied Math 500 Course: Intro to scientific computing in Python
CSS
77
star
23

PracticalLombScargle

Source for my paper, Understanding the Lomb-Scargle Periodogram
Jupyter Notebook
67
star
24

ESAC-stats-2014

Material for my lectures at the ESAC statistics conference, Oct 27-31 2014
Python
65
star
25

nufftpy

Experimenting with pure-Python implementation of the NUFFT
Python
65
star
26

pyCRFsuite

C
61
star
27

cython_template

Package template for a project using Cython
Python
58
star
28

matplotlib_pydata2013

PyData SV 2013 Tutorial on Advanced Matplotlib
Python
57
star
29

pypropack

A python wrapper for the PROPACK library
Fortran
51
star
30

jakevdp.github.io

Pythonic Perambulations website. Source at http://github.com/jakevdp/jakevdp.github.io-source
HTML
40
star
31

travis-python-template

Small template for setting up Travis CI with Python
Python
30
star
32

multiband_LS

Source for our paper on multiband periodograms.
TeX
29
star
33

data-USstates

Collection of CSV data on US states for Pandas merge demos
28
star
34

ProntoData

Scripts to Analyze Pronto's Data Release
25
star
35

klsh

Python implementation of Kernelized Locality Sensitive Hashing
Python
25
star
36

PythonLectures

Various One-off Lectures on Python
Jupyter Notebook
23
star
37

website

My Personal Web Page: http://vanderplas.com
TeX
21
star
38

GitIntro

Git tutorial materials
Jupyter Notebook
21
star
39

pyDistances

Work in progress for eventual contribution to scikit-learn
Python
19
star
40

PyData2014

Materials for my talks at PyData 2014 at Strata, NYC
17
star
41

pyLLE

python wrapper of fast C++ LLE code
C++
17
star
42

OsloWorkshop2014

Material for my lectures at the University of Oslo, Dec 2014
Python
16
star
43

OpenVisConf2014

My Talk for the 2014 OpenVisConf, April 24-25 in Boston, MA
Python
16
star
44

git-intro

Git/GitHub Intro
13
star
45

marathon-data

Marathon finishing times
10
star
46

siglearn

Tools for machine learning & modeling with noisy data
Python
10
star
47

pyTree

python tree algorithms for nearest neighbor search
Python
9
star
48

SciAmBlogPost

Source materials for my Scientific American blog post
9
star
49

python-vis-landscape

Code for my talk at PyCon 2017
Jupyter Notebook
9
star
50

mpl_tutorial

A basic Matplotlib tutorial
Python
9
star
51

open-recipe-data

Open recipe data used by the Python Data Science Handbook
8
star
52

Thesis

Jake's PhD thesis, completed summer 2012
TeX
8
star
53

memview_benchmarks

Benchmarks of various ways of handling contiguous typed data arrays in cython
Python
8
star
54

jakevdp.github.com

Octopress Blog (replaced by jakevdp.github.io)
JavaScript
8
star
55

kdsphere

KD Trees for Spherical Data
Python
7
star
56

BinaryTree

A cython implementation of a binary search tree (Ball Tree & KD Tree)
Python
7
star
57

colab-snippets

Snippets of useful code in Colaboratory
Jupyter Notebook
7
star
58

GeneratedNotebookExample

An example of a programmatically generated IPython notebook
Jupyter Notebook
7
star
59

supsmu

Wrapper of Friedman's 1984 Supersmoother Fortran code
Fortran
6
star
60

JupyterWorkflow-source

Materials for my Jupyter Workflow videos
Jupyter Notebook
6
star
61

python-tutorial

Files for my Python tutorials
6
star
62

data-CDCbirths

Historical US birth data culled from the CDC website
5
star
63

lombscargle

Efficient, astropy-compatible implementation of the Lomb-Scargle periodogram
Python
5
star
64

test-notebook-links

Jupyter Notebook
5
star
65

PythonDataScienceReview

Work-in-progress review for WIREs
TeX
5
star
66

nyquist

The Nyquist frequency is probably not what you think it is.
5
star
67

schema-validation

Experiments with validating JSONSchema in Python
Python
4
star
68

try-out-git

Trying git for a git tutorial
Python
4
star
69

talks

Slides, notebooks, and other info from my talks
Jupyter Notebook
4
star
70

hist-review

A Review of Histogram techniques
Python
3
star
71

nfftls

Astropy-style Lomb-Scargle periodograms computed with the NFFT
Python
3
star
72

NIPS2013_sklearn

Abstract for NIPS 2013
3
star
73

AstroAuthorQuery

A simple chrome extension to do quick ADS author queries from the Chrome Omnibox
JavaScript
3
star
74

uw-6.github.io

CSS
2
star
75

jakevdp-git-test

Test repo for a tutorial
Python
2
star
76

FreqBayes

[obsolete] Temporary repository for Frequentism vs Bayesianism writeup
2
star
77

obsdwarf

Measuring modified gravity with Dwarf galaxies
Python
2
star
78

pyOrbits

python orbit integration
Python
2
star
79

bicycle-data

Bicycle dataset used in the Python Data Science Handbook
2
star
80

Stripe82RRLyrae

Searching LSST-pipeline Stripe 82 photometry for RR Lyrae
TeX
2
star
81

ASTR599_homework

Repository for students to submit homework assignments as pull requests
Python
2
star
82

ECML_PKDD_2013

Paper by scikit-learn contributors for ECML/PKDD 2013 conference
TeX
2
star
83

PMLC-2014

Lecture materials for Microsoft Practice of Machine Learning Conference, Oct 23-24 2014
Python
2
star
84

git-test

Testing repo for git tutorial
1
star
85

myproject

Git tutorial project
1
star
86

DSFP-demo

Demo repository for DSFP
Jupyter Notebook
1
star
87

vega_firefox_bug

d
Python
1
star
88

ProbabilisticLensing

An attempt to formulate the weak lensing problem in a probabilistic framework.
Python
1
star
89

LRG_redshifts

Finding double redshifts in SDSS LRGs
Python
1
star
90

spec_data

Cleaning SDSS spectra for use with Mmani
Python
1
star
91

FastTemplatePeriodogram_old

Python
1
star
92

jupyter-demo

Jupyter demo at Microsoft
Jupyter Notebook
1
star
93

test

Testing repository from the Python Bootcamp
1
star
94

regper

Experiment with regularized periodograms
Jupyter Notebook
1
star
95

test-cairo

Small repo to test nodejs + cairo on TravisCI
1
star
96

spheredb

Python utilities for storage and manipulation of spherical data in SciDB.
Python
1
star
97

altair2

Just playing around... nothing to see here
Python
1
star
98

conda-recipes

Some conda build recipes
Shell
1
star
99

BombScargle

Test implementation of a robust multi-term Lomb-Scargle variant
TeX
1
star
100

vgparser

(experimental) Python parser for Vega's expression language
Jupyter Notebook
1
star