PyCon 2015 Pandas tutorial materials

Welcome to Brandon’s Pandas Tutorial

The first instance of this tutorial was delivered at PyCon 2015 in Montréal, but I hope that many other people will be able to benefit from it over the next few years, both when I get to deliver it myself and when other instructors are able to do so.

If you want to follow along with the tutorial at home, here is the YouTube recording of the 3-hour tutorial at PyCon itself:

Watch the video tutorial on YouTube

https://www.youtube.com/watch?v=5JnMutdy6Fw

To make it useful to as many people as possible, I hereby release it under the MIT license (see the accompanying LICENSE.txt file) and I have tried to make sure that this repository contains all of the scripts needed to download and set up the data set that we used.

Quick Start

If you have both conda and git on your system (otherwise, read the next section for more detailed instructions):

$ conda install --yes jupyter matplotlib pandas
$ git clone https://github.com/brandon-rhodes/pycon-pandas-tutorial.git
$ cd pycon-pandas-tutorial
$ build/BUILD.sh
$ ipython notebook

Detailed Instructions

You will need Pandas, the IPython Notebook, and Matplotlib installed before you can successfully run the tutorial notebooks. The Anaconda Distribution is a great way to get up and running quickly without having to install them each separately — running the conda command shown above will install all three.
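Before opening the notebooks, it may help to confirm that the three packages can actually be imported. The following is a minimal sketch, not part of the tutorial's own scripts: the helper name `missing_packages` is hypothetical, and the import names `pandas`, `matplotlib`, and `IPython` are assumed to correspond to the three requirements named above.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for Pandas, Matplotlib, and the IPython Notebook.
required = ["pandas", "matplotlib", "IPython"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If anything is reported as missing, the conda command from the Quick Start section is the simplest way to install it.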

Note that having git is not necessary for getting the materials. Simply click the “Download ZIP” button over on the right-hand side of this repository’s front page at the following link, and its files will be delivered to you as a ZIP archive:

https://github.com/brandon-rhodes/pycon-pandas-tutorial

Once you have unpacked the ZIP file, download the following four IMDB data files and place them in the tutorial’s build directory:

  • ftp://ftp.fu-berlin.de/misc/movies/database/frozendata/actors.list.gz
  • ftp://ftp.fu-berlin.de/misc/movies/database/frozendata/actresses.list.gz
  • ftp://ftp.fu-berlin.de/misc/movies/database/frozendata/genres.list.gz
  • ftp://ftp.fu-berlin.de/misc/movies/database/frozendata/release-dates.list.gz

If the above links don’t work for you, try these alternate sources of the same files:

  • ftp://ftp.funet.fi/pub/mirrors/ftp.imdb.com/pub/frozendata/actors.list.gz
  • ftp://ftp.funet.fi/pub/mirrors/ftp.imdb.com/pub/frozendata/actresses.list.gz
  • ftp://ftp.funet.fi/pub/mirrors/ftp.imdb.com/pub/frozendata/genres.list.gz
  • ftp://ftp.funet.fi/pub/mirrors/ftp.imdb.com/pub/frozendata/release-dates.list.gz
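If you would rather not fetch the four files by hand, the download can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `candidate_urls` and `download_all` are hypothetical, the file names and mirror prefixes are copied from the two lists above, and the script assumes it is run from the top of the unpacked tutorial directory so that `build` resolves correctly. Whether the FTP servers are still reachable is not something the sketch can guarantee.

```python
import os
import urllib.request

# File names and mirror prefixes copied from the two lists above.
FILENAMES = [
    "actors.list.gz",
    "actresses.list.gz",
    "genres.list.gz",
    "release-dates.list.gz",
]
MIRRORS = [
    "ftp://ftp.fu-berlin.de/misc/movies/database/frozendata/",
    "ftp://ftp.funet.fi/pub/mirrors/ftp.imdb.com/pub/frozendata/",
]


def candidate_urls(filename, mirrors=MIRRORS):
    """Return the URLs to try for one file, primary mirror first."""
    return [mirror + filename for mirror in mirrors]


def download_all(build_dir="build"):
    """Fetch each missing file into `build_dir`, trying each mirror in turn."""
    for filename in FILENAMES:
        target = os.path.join(build_dir, filename)
        if os.path.exists(target):
            continue  # already downloaded on an earlier run
        partial = target + ".part"  # avoid leaving a truncated file in place
        for url in candidate_urls(filename):
            try:
                urllib.request.urlretrieve(url, partial)
            except OSError as error:  # urllib's URLError is a subclass of OSError
                print("Could not fetch", url, "-", error)
                continue
            os.replace(partial, target)
            break
        else:
            raise RuntimeError("No mirror provided " + filename)
```

Calling `download_all()` from the tutorial directory tries the first mirror for each file and falls back to the second only if the first fails.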

To convert these into the CSV files that the tutorial needs, run the BUILD.py script with either Python 2 or Python 3. It will create the three CSV files in the data directory that you need to run all of the tutorial examples. It should take about 5 minutes to run on a fast modern machine:

$ python build/BUILD.py
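Once the build has finished, a quick check that the CSV files exist can save some confusion later. The sketch below is an assumption-laden illustration rather than part of the tutorial: the helper name `summarize_csv_files` is hypothetical, and it assumes the files are UTF-8 encoded and that each begins with a header row. It uses only the standard library so that it runs even if Pandas is not yet installed.

```python
import csv
from pathlib import Path


def summarize_csv_files(data_dir="data"):
    """Map each CSV file name in `data_dir` to its first row (assumed header)."""
    summary = {}
    for path in sorted(Path(data_dir).glob("*.csv")):
        # Encoding is an assumption; adjust it if the files fail to decode.
        with path.open(newline="", encoding="utf-8") as handle:
            summary[path.name] = next(csv.reader(handle), [])
    return summary
```

Running `print(summarize_csv_files())` from the tutorial directory should list three file names if the build succeeded; an empty dictionary suggests the build did not complete.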

You can then start up the IPython Notebook and start looking at the notebooks:

$ ipython notebook

I hope that the recording and the exercises in this repository prove useful if you are interested in learning more about Python and its data analysis capabilities!

Brandon Rhodes
