  • Stars: 194
  • Rank: 200,219 (Top 4%)
  • Language: Jupyter Notebook
  • Created over 8 years ago
  • Updated about 7 years ago

Repository Details

Course materials for my data pipeline video course with O'Reilly

Data Pipelines with Python (video edition)

Welcome to the code repository for Data Pipelines with Python! If you have any questions, reach out to @kjam on Twitter or GitHub.

Code Structure

Most of the code covered in the videos is here, but not all of it. I highly recommend you take the time to type out all the code along with the videos, and use these scripts only to double-check or remind yourself of work you've already completed.

Installation

Install with the requirements.txt file.

pip install -r requirements.txt

Yahoo Finance API

There is a good writeup in German on the Finance API, which I used as a starting point for downloading more recent data.

Python2 v. Python3

This repository is largely compatible with both Python 2 and Python 3. Please let me know if you run into any bugs!
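If you adapt the scripts and want to keep them working under both versions, the usual pattern looks something like the sketch below. This is only an illustration of the general technique, not code taken from the course: the `build_query` helper and the example parameters are hypothetical.

```python
# A minimal sketch of code that runs under both Python 2 and Python 3.
# build_query is a hypothetical helper, not part of this repository.
from __future__ import division, print_function

try:
    # Python 3 location
    from urllib.parse import urlencode
except ImportError:
    # Python 2 location
    from urllib import urlencode


def build_query(params):
    """Return a URL query string with the keys in sorted order."""
    return urlencode(sorted(params.items()))


if __name__ == "__main__":
    # print() and true division behave the same in both versions
    # thanks to the __future__ imports above.
    print(build_query({"s": "AAPL", "f": "l1"}))
    print(3 / 2)
```

The `__future__` imports cover the two differences people hit most often (print and integer division), and the try/except import handles modules that moved between versions.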

Ansible Playbook

I've included a working playbook in the deploy folder. It is meant as a starting point to adapt, not something to run as-is: if you try to run it directly, you will likely receive some errors. Please read through the playbook, take a look at the directives, and determine which you need and which you don't. It also requires a .ssh/authorized_hosts file as well as a config file located in celeryapp/config/prod.cfg. If you run into other errors, I highly recommend reading through the Ansible documentation or searching on Stack Overflow.
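Before running the playbook, it can help to confirm that the two files mentioned above are actually in place. The sketch below is one way to do that. It assumes both paths are relative to the directory you pass in (the repository root by default), which may not match your setup, and the `missing_deploy_files` function is a hypothetical helper rather than part of this repository.

```python
# A minimal pre-flight check for the files the playbook expects.
# Assumption: both paths are relative to base_dir; adjust if yours differ.
import os

REQUIRED_FILES = [
    os.path.join(".ssh", "authorized_hosts"),
    os.path.join("celeryapp", "config", "prod.cfg"),
]


def missing_deploy_files(base_dir="."):
    """Return the required files that do not exist under base_dir."""
    return [
        path
        for path in REQUIRED_FILES
        if not os.path.isfile(os.path.join(base_dir, path))
    ]


if __name__ == "__main__":
    missing = missing_deploy_files()
    if missing:
        for path in missing:
            print("Missing: " + path)
    else:
        print("All required files found.")
```

This only checks that the files exist. It does not validate their contents, so errors from Ansible itself are still possible.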

Corrections?

If you find any issues in these code examples, feel free to submit an Issue or Pull Request. I appreciate your input!

Questions?

Reach out to @kjam on Twitter or GitHub. @kjam is also often on freenode. :)

More Repositories

1. data-cleaning-101: Data Cleaning Libraries with Python (Jupyter Notebook, 279 stars)
2. python-web-scraping-tutorial: A Python-based web and data scraping tutorial (Python, 210 stars)
3. wswp: Code for the second edition Web Scraping with Python book by Packt Publications (Python, 130 stars)
4. data-wrangling-pycon: An Introduction to Data Wrangling with Python (Jupyter Notebook, 81 stars)
5. practical-data-privacy: Practical Data Privacy (Jupyter Notebook, 70 stars)
6. python_flight_search: Using Python to search for flights. (Python, 54 stars)
7. datafuzz: A data science Python library aimed at adding fuzz, noise and other issues to your data for testing purposes. (Python, 30 stars)
8. data-wrangling-video: Code and examples for O'Reilly's Data Wrangling with Python video course (Jupyter Notebook, 28 stars)
9. intro-to-ml: A basic introduction to machine learning (one day training). (Jupyter Notebook, 16 stars)
10. random_hackery: Just little bits. (Jupyter Notebook, 10 stars)
11. europarl_scraper: European Parliament website Python scraper (Jupyter Notebook, 9 stars)
12. uf-data-mining-and-analysis: University of Florida Data Mining and Analysis (Jupyter Notebook, 8 stars)
13. web-scraping-speed-comparison: A Python web scraping speed comparison (Python, 6 stars)
14. uf-intro-to-programming: University of Florida Audience Analytics Introduction to Programming with Data course (HTML, 6 stars)
15. cherrypy-poll: Polling with cherrypy: A beginner's project guide to python programming (Python, 6 stars)
16. kjam-datalab-notebooks: Some Example Jupyter Notebooks using Google's DataLab (4 stars)
17. cron-parser: Python script that allows you to easily update a server cron that has many different projects without overwriting other crons. (Python, 1 star)
18. chatbot_scraper: Python scraper(s) for chatbot logs. Currently supports botbot.me logs. (Python, 1 star)