  • Stars: 257
  • Rank: 158,728 (Top 4%)
  • Language: Jupyter Notebook
  • License: MIT License
  • Created almost 8 years ago
  • Updated over 6 years ago


Repository Details

Full Stack Data Science in Python

Full Stack Data Science


"Jack of all trades, master of none, though oft times better than master of one."

One of the common pain points that we have come across in big organizations is the last-mile delivery of data science applications.

You code, you test, you ship and you maintain

One common delivery vehicle is a dashboard (BI). Another, which is very useful but often neglected, is to create APIs that integrate seamlessly with other applications within the company. This requires a basic understanding of machine learning, server-side programming and front-end applications.

In this workshop, you will learn how to build a seamless end-to-end data-driven application - Data Exploration, Machine Learning Model, RESTful API and Web Application - to solve a business prediction problem.

Course Content

  1. Introduction to Data Science Process
  2. Introduction to Data Exploration
  3. Introduction to Machine Learning
  4. Overview of the case we will be solving in the workshop
  5. A simple ML Model
  6. Creating RESTful API
  7. Persisting model output
  8. Updating the model as more data comes in (batch only - no streaming)
  9. A simple webpage front-end to visualise the results and interact with the API.
  10. Creating a simple application that accomplishes this end-to-end
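
As a rough illustration of how steps 5 to 7 fit together, here is a minimal sketch. It is not the workshop's actual code: the iris dataset, the decision tree, the use of pickle for persistence, the `/predict` route and the `MODEL_PATH` location are all assumptions chosen to keep the example short.

```python
import json
import pickle
import tempfile
from pathlib import Path

from flask import Flask, jsonify, request
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Step 5: a simple ML model (iris is a stand-in for the workshop's case data).
iris = load_iris()
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(iris.data, iris.target)

# Step 7: persist the fitted model so that a separate process can load it.
MODEL_PATH = Path(tempfile.gettempdir()) / "model.pkl"  # hypothetical location
with open(MODEL_PATH, "wb") as f:
    pickle.dump(model, f)

# Step 6: a small RESTful API that loads the saved model and serves predictions.
app = Flask(__name__)
with open(MODEL_PATH, "rb") as f:
    served_model = pickle.load(f)


@app.route("/predict", methods=["POST"])  # hypothetical route name
def predict():
    payload = request.get_json(force=True)
    features = payload["features"]  # four numbers, in the iris column order
    label = int(served_model.predict([features])[0])
    return jsonify({"species": str(iris.target_names[label])})


# Exercise the endpoint in-process with Flask's test client (no server needed).
client = app.test_client()
response = client.post(
    "/predict",
    data=json.dumps({"features": [5.1, 3.5, 1.4, 0.2]}),
    content_type="application/json",
)
result = json.loads(response.get_data(as_text=True))
print(result)
```

In a real deployment the training and serving parts would normally live in separate scripts, with the saved model file as the only link between them.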

An advanced version of the workshop, taught over two days, will cover the following additional topics:

  1. Building data pipelines and models
  2. Deployment on the cloud
  3. Automating the workflow (e.g. using Airflow)
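
One possible reading of "pipelines and models" is chaining preprocessing and a model into a single object that can be fitted, evaluated and deployed as one unit. The sketch below uses scikit-learn's `Pipeline` for this; the dataset, the scaler and the logistic regression are assumptions made for illustration, not the workshop's material.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data; the workshop uses its own business case.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Preprocessing and model are fitted together, so the scaler only ever
# sees the training data and is re-applied automatically at predict time.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=200)),
])
pipeline.fit(X_train, y_train)

accuracy = pipeline.score(X_test, y_test)
print("held-out accuracy: {:.2f}".format(accuracy))
```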

Target Audience

  • A programmer but not a data science practitioner: A programmer with experience in server-side or front-end development who may have some familiarity with data analysis. You could be looking to transition into building data-driven products or to create a richer product experience with data.
  • A data science practitioner but not a programmer: A data scientist with some experience in data analysis, preferably in a scripting language (R/Python/Scala), who wants a deeper and more applied perspective on creating data-driven products.

Pre-requisites

  • Programming knowledge is mandatory. Attendees should be able to write conditional statements, use loops, be comfortable writing functions, understand code snippets and come up with programming logic.
  • Participants should have a basic familiarity with Python. Specifically, we expect participants to know the material in the first four sections of the Python Practice Book: http://anandology.com/python-practice-book/
  • Participants should also have some experience with using Python for Data Science. Specifically, participants should be able to work with the following Python libraries:
    • jupyter: For doing literate programming in notebooks
    • numpy: For scientific computation
    • pandas: For data wrangling and transformation of tabular data (dataframes)
    • scikit-learn: For building machine learning models
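
As a rough self-check of the expected level, a participant should be able to read a snippet like the following without difficulty. The numbers are made up for illustration and are not workshop data.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# numpy: vectorised computation on arrays (made-up values).
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
score = 10.0 * hours + 5.0  # exact linear relationship, no noise

# pandas: tabular data as a DataFrame, with a derived boolean column.
df = pd.DataFrame({"hours": hours, "score": score})
df["passed"] = df["score"] >= 30
passed_count = int(df["passed"].sum())

# scikit-learn: fit a model on DataFrame columns and inspect its parameters.
model = LinearRegression().fit(df[["hours"]], df["score"])
slope = float(model.coef_[0])
intercept = float(model.intercept_)

print(passed_count, round(slope, 2), round(intercept, 2))
```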

Software Requirements

We will be using the Python data stack for the workshop. Please install Anaconda for Python 3.5 or 3.6. Additional requirements will be communicated to participants.

Install the required packages using conda.

conda install numpy pandas matplotlib seaborn scikit-learn pydotplus flask flask-wtf
conda install -c ioam holoviews bokeh

We'll also need the Python library firefly-python, which is not available as a conda package. Install it, along with rorolite, using pip.

pip install firefly-python rorolite
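
To confirm that the installation worked, a short check along these lines can help. It is only a sketch: the import names listed are assumptions based on the usual mapping from install name to import name (for example `scikit-learn` is imported as `sklearn`, and `firefly-python` is assumed to be imported as `firefly`), so adjust the list if your environment differs.

```python
import importlib.util

# Import names assumed for the packages installed above; the import name
# can differ from the install name (scikit-learn -> sklearn,
# flask-wtf -> flask_wtf, firefly-python -> firefly).
REQUIRED = [
    "numpy", "pandas", "matplotlib", "seaborn", "sklearn", "pydotplus",
    "flask", "flask_wtf", "holoviews", "bokeh", "firefly", "rorolite",
]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```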

Facilitators’ Profile

Anand Chitipothu has been crafting beautiful software for a decade and a half. He's now building a data science platform, rorodata, which he recently co-founded. He regularly conducts advanced programming courses through Pipal Academy. He is co-author of web.py, a micro web framework in Python. He has worked at Strand Life Sciences and Internet Archive. You can tweet him at @anandology.

Amit Kapoor teaches the craft of telling visual stories with data. He conducts workshops and training sessions on Data Science in Python and R, as well as on Data Visualisation topics. His background is in strategy consulting, having worked with AT Kearney in India, then with Booz & Company in Europe and more recently for startups in Bangalore. He did his B.Tech in Mechanical Engineering from IIT, Delhi and PGDM (MBA) from IIM, Ahmedabad. You can find more about him at http://amitkaps.com/ and tweet him at @amitkaps.

Bargava Subramanian is a practicing Data Scientist. He has 14 years of experience delivering business analytics solutions to Investment Banks, Entertainment Studios and High-Tech companies. He has given talks and conducted workshops on Data Science, Machine Learning, Deep Learning and Optimization in Python and R. He has a Master's in Statistics from University of Maryland, College Park, USA. He is an ardent NBA fan. You can tweet him at @bargava.
