• Stars
    362
  • Rank 113,943 (Top 3 %)
  • Language
    Jupyter Notebook
  • License
    MIT License
  • Created almost 8 years ago
  • Updated over 5 years ago

Repository Details

Repository to accompany "Pandas for Everyone"

Pandas for Everyone

Repository to accompany "Pandas for Everyone".

If you have gone through the book, an Amazon review would be much appreciated! My mom would appreciate it too :)

Setup

The easiest way to get everything you need for the tutorial is to install Anaconda.

You can download and install it here: https://www.continuum.io/downloads

To download just the data, see the Data section below. Otherwise, you can clone this repository, or click the "Clone or Download" link above and then click "Download ZIP".

Install seaborn for plotting

conda install seaborn

Install all the packages used in the book

There is an error in the book's preface about installing packages. I am leaving this section in the README as an updated list of packages and installation instructions.

(Optional) Create a Virtual Environment

You can choose to create a virtual environment for the packages used in the book, so they don't clash with other packages you plan to use later on.

# create a virtual environment named "book" using python 3.6
conda create -n book python=3.6

# activate the environment
# so all installed packages will go in there and not mess up your base python environment
source activate book
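After activating the environment, it can be useful to confirm that Python is really running inside it. The following is a minimal sketch, not part of the book: the helper name `describe_environment` is hypothetical, and it assumes conda exposes the active environment name through the `CONDA_DEFAULT_ENV` environment variable.

```python
import os
import sys


def describe_environment(environ=None, expected_env="book", expected_python=(3, 6)):
    """Report the active conda environment and Python version.

    `expected_env` and `expected_python` mirror the commands above
    (an environment named "book" created with python=3.6).
    """
    # Fall back to the real process environment when no mapping is passed in.
    environ = os.environ if environ is None else environ
    # Assumption: conda sets CONDA_DEFAULT_ENV when an environment is activated.
    active_env = environ.get("CONDA_DEFAULT_ENV")
    python_version = tuple(sys.version_info[:2])
    return {
        "active_env": active_env,
        "env_matches": active_env == expected_env,
        "python_version": python_version,
        "python_matches": python_version == expected_python,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `env_matches` is reported as `False`, the environment was most likely not activated in the current shell.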

Install the packages

Whether you decided to create a virtual environment or not, you can install the packages with the commands below. If you did use a virtual environment, remember to source activate book before you follow along with the book so the packages you installed can be loaded.

conda install pandas xlwt openpyxl seaborn numpy ipython jupyter statsmodels scikit-learn regex wget odo numba
conda install -c conda-forge pweave # you don't really need this package, it was used to build and create the book
conda install -c conda-forge feather-format
pip install lifelines pandas-datareader
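Once the install commands finish, a quick check can show whether the packages are importable from the current environment. This is a sketch under assumptions: the helper `find_missing` is hypothetical, and the mapping below covers only a subset of the packages above, using the import names I believe they expose (for example, scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Assumed mapping of install name -> import name for a subset of the packages.
BOOK_MODULES = {
    "pandas": "pandas",
    "xlwt": "xlwt",
    "openpyxl": "openpyxl",
    "seaborn": "seaborn",
    "numpy": "numpy",
    "ipython": "IPython",
    "statsmodels": "statsmodels",
    "scikit-learn": "sklearn",
    "regex": "regex",
    "numba": "numba",
    "lifelines": "lifelines",
    "pandas-datareader": "pandas_datareader",
}


def find_missing(modules):
    """Return the install names whose import name cannot be located."""
    return sorted(
        package
        for package, module in modules.items()
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    missing = find_missing(BOOK_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All checked packages are importable.")
```

The check only locates the modules without importing them, so it runs quickly even for large packages.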

Teaching Slides

For instructors who are using the teaching slide deck version of the book, each chapter is split into its own slide deck. There are multiple versions of each chapter:

  1. Jupyter notebook (ipynb)
  2. PDF
  3. HTML

The slides are created using Damian Avila's RISE Jupyter/IPython Slideshow Extension, so you can install the RISE extension to render and present the Jupyter notebooks (ipynb) live. Since each chapter is a Jupyter notebook at heart, the conversions to PDF and HTML are performed using:

jupyter nbconvert --to slides your_talk.ipynb --post serve

More about usage and converting to PDF can be found on the usage page of the RISE documentation.
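With one notebook per chapter, running the conversion by hand for each file gets repetitive. The sketch below wraps the same `jupyter nbconvert --to slides` command in Python; the helper names `build_slides_command` and `convert_all` and the directory name `slides` are hypothetical and are not how the repository itself builds its slides.

```python
import subprocess
from pathlib import Path


def build_slides_command(notebook, serve=False):
    """Build the nbconvert command line for one notebook."""
    command = ["jupyter", "nbconvert", "--to", "slides", str(notebook)]
    if serve:
        # Same as the command above: serve the slides after converting.
        command += ["--post", "serve"]
    return command


def convert_all(directory):
    """Convert every .ipynb file in `directory` to reveal.js HTML slides."""
    for notebook in sorted(Path(directory).glob("*.ipynb")):
        # check=True raises CalledProcessError if nbconvert fails.
        subprocess.run(build_slides_command(notebook), check=True)


if __name__ == "__main__":
    convert_all("slides")  # hypothetical directory holding the chapter notebooks
```

The batch helper leaves out `--post serve` because serving blocks until the server is stopped, which would stall the loop after the first notebook.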

No PowerPoint (.ppt/.odp)

RISE's back end uses reveal.js. Unfortunately, there is no way to go from a reveal.js presentation to PowerPoint. Having said that, if there's a way we can jerry-rig something together using the given capabilities of RISE and reveal.js, please let me know.

Data

You can download just the datasets with Minhas Kamal's DownGit by clicking the link here.

Ongoing list of data references:

  1. Gapminder: https://github.com/jennybc/gapminder/
  2. Survey: Comes from the Software-Carpentry SQL lesson
  3. Ebola: www.github.com/cmrivers/ebola

Links to teaching sessions

I taught from the book while I was writing it. Here you can find the various tutorials and workshops I've taught (both before and after the book was officially published). You can also check out my talks page for other things that are not entirely about Pandas.

| Session | URL | Video |
| Online Live Training | https://github.com/chendaniely/2017-12-04-pandas_live, https://github.com/chendaniely/2018-05-pandas_live, https://github.com/chendaniely/2018-06-pandas_live | |
| Whirlwind tour of Python | https://github.com/chendaniely/2017-10-26-python_crash_course | |
| SciPy 2017 Pandas Tutorial | https://github.com/chendaniely/scipy-2017-tutorial-pandas | https://www.youtube.com/watch?v=oGzU688xCUs |
| PyData Carolinas 2016 Tutorial | https://github.com/chendaniely/2016-pydata-carolinas-pandas | https://www.youtube.com/watch?v=dye7rDktJ2E |

Other random goodies

More Repositories

1

scipy-2017-tutorial-pandas

SciPy 2017 Pandas Tutorial
Jupyter Notebook
157
star
2

pydatadc_2018-tidy

PyData 2018 tutorial for tidying data
Jupyter Notebook
150
star
3

pyprojroot

Finding project directories in Python (data science) projects, just like in R rprojroot and here packages
Python
115
star
4

scipy-2019-pandas

Pandas tutorial for SciPy 2019
Jupyter Notebook
96
star
5

scipy_2017_notes

Links and notes for SciPy 2017
59
star
6

scipy_2016_notes

52
star
7

2016-pydata-carolinas-pandas

Material for Pandas Tutorial at Pydata Carolinas 2016
Jupyter Notebook
42
star
8

scipy-2020-pandas

Learn Python through Data Processing in Pandas Tutorial
Jupyter Notebook
38
star
9

computational-project-cookie-cutter

A cookie cutter to set up a folder structure for a computational project
Shell
30
star
10

2015-06-22-s2i2-git

http://guyrt.github.io/2015-06-21-s2i2/
Python
29
star
11

2015-06-25-jhu-git

http://chendaniely.github.io/2015-06-25-jhu/
Python
21
star
12

2021-07-13-scipy-pandas

Learn Python Through Data Processing in Pandas
Jupyter Notebook
18
star
13

animal_crossing

Tom Nook is a loan shark.
HTML
16
star
14

obsidian-templates

templates I use for obsidian and zettelkasten
15
star
15

pycon_2019-pandas_tutorial

PyCon 2019 Pandas Tutorial
Jupyter Notebook
15
star
16

2020-08-26-rstudio_debunk

Debunking the R vs. Python Myth
R
12
star
17

ds4biomed

Data Science for the Biomedical Sciences
R
12
star
18

2016-pydata-dc-python_useRs

Python useRs talk for PyData DC 2016
HTML
9
star
19

2015-04-25-rstatsnyc-ebola

HTML
7
star
20

2017-06-26-meetup-r_you_markDown

HTML
7
star
21

pybay_2019-pandas_tutorial

PyBay 2019 Pandas Tutorial
Jupyter Notebook
6
star
22

rstatsdc_2018-structure

My talk (and code) for #rstatsdc 2018 conference
HTML
6
star
23

chendaniely.github.io

Perpetually under construction
HTML
6
star
24

2017-10-26-python_crash_course

Jupyter Notebook
5
star
25

2016-07-sismid

Notes, code, and slides from SISMID 2016
HTML
5
star
26

2022-01-04-python

Jupyter Notebook
4
star
27

2019-07-29-python_live

Jupyter Notebook
4
star
28

2020-02-python

Jupyter Notebook
4
star
29

2023-01-04-python

Jupyter Notebook
4
star
30

animalcrossing

R package for animal crossing data
R
4
star
31

dissertation-analysis

HTML
3
star
32

odsc-east-2020-plumber_docker

R
3
star
33

2022-06-29-python

Jupyter Notebook
3
star
34

2015-04-15-SPDC-shiny

Shiny Tutorial for SPDC
HTML
3
star
35

rstatsdc_2019-python-r

Talk for DCR conference on using Python within the R ecosystem with reticulate
HTML
3
star
36

bigbookofpython

HTML
3
star
37

multi-agent-neural-network

Multi Agent Neural Network (MANN)
Python
3
star
38

odsc-east-2020-forecasting_timeseries

3
star
39

2018-06-pandas_live

Jupyter Notebook
3
star
40

rstatsdc-2020-tidyeval

R
3
star
41

sklearn-crash_course

Jupyter Notebook
3
star
42

odsc-east-2020-intro_r

HTML
3
star
43

pydata-nyc-2022-python_quarto

Install Python, Quarto All the Things!
JavaScript
2
star
44

rstatsnyc-2021-learner_personas

HTML
2
star
45

2017-12-04-pandas_live

Jupyter Notebook
2
star
46

2015-11-23-harvard

http://chendaniely.github.io/2015-11-23-harvard/
Python
2
star
47

multidisciplinary-diffusion-model-experiments

Jupyter Notebook
2
star
48

rstatsnyc-2022-analysis_presentations

HTML
2
star
49

2015-11-23-harvard-git

Git lesson at Harvard: chendaniely.github.io/2015-11-23-harvard
2
star
50

2022-03-09-git_collab

2
star
51

cdcepi-zika-data_only

cdcepi zika data repository containing only the data files
R
2
star
52

python_r_demos

I don't know what to name this repo
Jupyter Notebook
2
star
53

git_book

Book about using Git
Jupyter Notebook
2
star
54

2021-09-01-git_dan

2
star
55

rstatsnyc_2019-workflow

Talk for the NYR conference on workflows
HTML
2
star
56

2018-05-pandas_live

Jupyter Notebook
2
star
57

rstatsnyc_2017-data_scientist

So you want to be a data scientist?
HTML
2
star
58

wicer-entice3

An Interdisciplinary Communication Tool to Support the Process of Generating Tailored Infographics From Electronic Health Data Using EnTICE3
R
2
star
59

2020-08-12-nyr_git-dan

NYR 2020 Git workshop
R
1
star
60

data_science-figure

Rebol
1
star
61

2023-05-01-git

1
star
62

pydatadc18_lightning

Properties at PyData DC 2018
Jupyter Notebook
1
star
63

aur

Scripts to automate AUR things
R
1
star
64

2020-04-27-git-dan

1
star
65

2022-09-30-yvr_datafest

Jupyter Notebook
1
star
66

dspg18_training

The stuff I typed in class
1
star
67

web-of-science

Python
1
star
68

docker-ohdsi-omop-synpuf

A Docker container that implements the OMOP common data model with the SYNPUF open Medicare dataset pre-loaded
1
star
69

zika_dashboard_cdc

Shiny Dashboard for CDC Zika data
R
1
star
70

dirty

Dirties data in R
R
1
star
71

2020-02-26-git_collaboration

Git training for collaboration
1
star
72

asa-data_jamboree-2022

1
star
73

2016-scipy-lightning-pandas_for_everyone

Lightning talk for Pandas for Everyone
Jupyter Notebook
1
star
74

git-book

CSS
1
star
75

2019-11-11-git_collaboration-dan

Git Collaboration LiveLesson
1
star
76

deepsaber

Have a computer create beatsaber mappings for songs
Python
1
star
77

fluent_python

Coding through the book
Jupyter Notebook
1
star
78

2021-11-02-git_basics

1
star
79

citeseerx-citation-network

Methods to get citation information from CiteSeerX
Python
1
star
80

ansible_r_pkg_cran

Ansible module to install R packages from CRAN
1
star
81

2020-10-06-git-dan

Git basics workshop
1
star
82

2020-04-28-git_collaboration-dan

Git collaboration workshop
1
star
83

rstatsnyc_2020-learnr_gradethis

NYR Conference 2020
HTML
1
star
84

elemental_maths

Notes from reading Elements of Statistical Learning
TeX
1
star
85

dissertation-phase4_exercises

R
1
star
86

dissertation-fun

R
1
star
87

2020-09-09-CCatHome-git-dan

Carpentry Con at Home Git workshop Part 2
1
star
88

jupyter-book-demo

Jupyter Notebook
1
star
89

libloadR

R package that handles loading and installing multiple libraries from an R script file
R
1
star
90

2023-12-12-python

Jupyter Notebook
1
star
91

dissertation-prelim

Prelims a la R01
TeX
1
star
92

2018-10-python2

Jupyter Notebook
1
star
93

ncb-2019-python

Materials for Python talk at the 2019 Nonclinical Biostatistics Conference
Jupyter Notebook
1
star
94

2017-08-15-meetup-r_you_markDown

HTML
1
star
95

2023-01-23-demo-gha-test

1
star
96

yvr-ubc-notes

1
star
97

2022-03-08-git_basics

1
star
98

2023-01-20-merge_conflict_2

Jupyter Notebook
1
star
99

geoscrapeR

R package that scrapes and contains geo data
R
1
star
100

2022-05-31-dsci100-git-demo

Our first DSCI100 Git Repository
1
star