• Stars: 120
• Rank: 286,392 (Top 6 %)
• Language: HTML
• License: MIT License
• Created over 5 years ago
• Updated about 3 years ago

Repository Details

Intermediate Machine Learning with Scikit-learn, 4h interactive workshop

Intermediate Machine learning with scikit-learn

Part 2 of 4

Other parts:

Content

Instructor


This repository will contain the teaching material and other info associated with the "Intermediate Machine Learning with scikit-learn" course.

About the workshop

Scikit-learn is a machine learning library in Python that has become a valuable tool for many data science practitioners. This workshop will go beyond the basics and show how to effectively evaluate and tune algorithms. We will also discuss the most important machine learning algorithms that you're likely to see in practice, how and when to use them, and some details about how they work internally. The session will focus on linear models for classification and regression and on tree-based models, including random forests.
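
As a rough illustration of what "evaluate and tune" can look like in scikit-learn, the sketch below cross-validates a linear model and grid-searches a random forest. The dataset, the parameter grid, and the model settings are illustrative assumptions and are not taken from the workshop material.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Example dataset bundled with scikit-learn (an assumption for this sketch).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Evaluate a linear model with 5-fold cross-validation on the training set.
# Scaling lives inside the pipeline so it is re-fit on every fold.
linear = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
linear_scores = cross_val_score(linear, X_train, y_train, cv=5)

# Tune a random forest with a small, illustrative parameter grid.
param_grid = {"max_depth": [3, None], "max_features": ["sqrt", 0.5]}
search = GridSearchCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    param_grid,
    cv=5,
)
search.fit(X_train, y_train)

# The held-out test set is only used once, after tuning is finished.
test_accuracy = search.score(X_test, y_test)

print("Linear model CV accuracy: %.3f" % linear_scores.mean())
print("Best forest parameters:", search.best_params_)
print("Forest test accuracy: %.3f" % test_accuracy)
```

Keeping the test set out of both the cross-validation and the grid search is the design choice that matters here: the reported test accuracy is then not influenced by the tuning.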

Prerequisites

This workshop assumes familiarity with Jupyter notebooks and the basics of pandas, matplotlib and numpy. It also assumes familiarity with the basics of supervised learning, such as training and test data and the basics of model evaluation. You should have built a model with scikit-learn (or attended Introduction to Machine learning with scikit-learn) before taking this workshop.

Obtaining the Tutorial Material

If you are familiar with git, it is most convenient if you clone the GitHub repository. This is highly encouraged as it allows you to easily synchronize any changes to the material.

git clone https://github.com/amueller/ml-workshop-2-of-4.git

If you are not familiar with git, you can download the repository as a .zip file by heading over to the GitHub repository (https://github.com/amueller/ml-workshop-2-of-4) in your browser and clicking the green "Download" button in the upper right.

Please note that I may add to and improve the material until shortly before the tutorial session, and we recommend that you update your copy of the materials one day before the tutorial. If you have a GitHub account and forked/cloned the repository via GitHub, you can sync your existing copy with the following command:

git pull origin master

Installation Notes

This tutorial will require recent installations of

The last one is important: you should be able to type

jupyter notebook

in your terminal window and see the notebook panel load in your web browser. Try opening and running a notebook from the material to check that it works.

For users who do not yet have these packages installed, a relatively painless way to install all the requirements is to use a Python distribution such as Anaconda, which includes the most relevant Python packages for science, math, engineering, and data analysis. Anaconda can be downloaded and installed for free, including for commercial use and redistribution. The code examples in this tutorial require Python 3.5 or later.
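
For a quick check from a plain Python prompt, a minimal sketch along the following lines reports the Python version and which packages can be imported. It is not the content of check_env.ipynb; the function name check_environment and the package list are assumptions made for this illustration, based on the prerequisites named above.

```python
import importlib
import sys

# Hypothetical package list for illustration; "sklearn" and "notebook" are the
# import names of scikit-learn and Jupyter Notebook.
PACKAGES = ["numpy", "pandas", "matplotlib", "sklearn", "notebook"]


def check_environment(packages=PACKAGES, min_python=(3, 5)):
    """Return a dict mapping each name to a version string, or None if missing.

    The "python" key holds True if the interpreter is at least min_python.
    """
    report = {"python": sys.version_info[:2] >= min_python}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            # Not every module exposes __version__, so fall back to a label.
            report[name] = getattr(module, "__version__", "unknown version")
    return report


report = check_environment()
for name, status in report.items():
    if name == "python":
        print("Python >= 3.5:", "OK" if status else "too old")
    else:
        print("%s: %s" % (name, status if status is not None else "MISSING"))
```

Any package reported as MISSING can then be installed through your Python distribution before the session.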

After obtaining the material, we strongly recommend that you open and execute the Jupyter notebook check_env.ipynb, which is located at the top level of this repository. You can open the notebook by executing

jupyter notebook check_env.ipynb

inside this repository. Inside the Notebook, you can run the code cell by clicking on the "Run Cells" button as illustrated in the figure below:

Finally, if your environment satisfies the requirements for the tutorials, the executed code cell will produce an output message as shown below:

More Repositories

1. word_cloud (Python, 9,936 stars): A little word cloud generator in Python
2. introduction_to_ml_with_python (Jupyter Notebook, 7,146 stars): Notebooks and code for the book "Introduction to Machine Learning with Python"
3. scipy_2015_sklearn_tutorial (Python, 578 stars): Scikit-Learn tutorial material for Scipy 2015
4. scipy-2016-sklearn (Jupyter Notebook, 516 stars): Scikit-learn tutorial at SciPy2016
5. COMS4995-s19 (Jupyter Notebook, 304 stars): COMS W4995 Applied Machine Learning - Spring 19
6. ml-workshop-1-of-4 (HTML, 296 stars): Introduction to Machine learning with Python, 4h interactive workshop
7. scipy-2017-sklearn (Jupyter Notebook, 282 stars): Scipy 2017 scikit-learn tutorial by Alex Gramfort and Andreas Mueller
8. scipy-2018-sklearn (Jupyter Notebook, 247 stars): Scipy 2018 scikit-learn tutorial by Guillaume Lemaitre and Andreas Mueller
9. COMS4995-s20 (Jupyter Notebook, 242 stars): COMS W4995 Applied Machine Learning - Spring 20
10. mglearn (Python, 227 stars): mglearn helper package for "Introduction to Machine Learning with Python"
11. ml-training-intro (HTML, 225 stars): Materials for the "Introduction to Machine Learning" class
12. ml-training-advanced (Jupyter Notebook, 161 stars): Materials for the "Advanced Scikit-learn" class in the afternoon
13. COMS4995-s18 (Jupyter Notebook, 158 stars): COMS W4995 Applied Machine Learning - Spring 18
14. ml-workshop-4-of-4 (HTML, 157 stars): Advanced Machine Learning with Scikit-learn part II
15. kaggle_insults (Python, 153 stars): Kaggle Submission for "Detecting Insults in Social Commentary"
16. ml-workshop-3-of-4 (HTML, 133 stars): Advanced Machine Learning with Scikit-learn part I
17. gco_python (Python, 131 stars): Python wrappers for GCO alpha-expansion and alpha-beta-swaps
18. futurepast (Python, 119 stars): Deprecation tools for Python
19. advanced_training (Jupyter Notebook, 119 stars): Advanced Scikit-learn training session
20. applied_ml_spring_2017 (Jupyter Notebook, 89 stars): Website and material for the FIXME course on Practical Machine Learning
21. talks_odt (Jupyter Notebook, 89 stars): Slides and materials for most of my talks by year
22. odscon-2015 (80 stars): Slides and material for open data science
23. odscon-sf-2015 (Jupyter Notebook, 79 stars): Material for ODSCON San Francisco 2015
24. quick-ml-intro (Jupyter Notebook, 74 stars): One hour interactive training for ML with scikit-learn
25. aml (Jupyter Notebook, 73 stars): Applied Machine Learning with Python
26. pydata-nyc-advanced-sklearn (69 stars): Notebooks (and slides) for my PyData NYC 2014 tutorial on the more advanced features of scikit-learn.
27. sklearn_tutorial (CSS, 65 stars): Slides for quick intro to machine learning with sklearn
28. sklearn-one-day (HTML, 62 stars): One day workshop for machine learning with scikit-learn
29. segmentation (Python, 53 stars): Superpixel based semantic segmentation
30. pydata-strata-2015 (51 stars): Slides and notebooks for PyData Strata San Jose
31. scikit-learn-interactive-tutorial (51 stars): IPython notebooks and data an interactive scikit-learn tutorial.
32. patsylearn (Python, 49 stars): Patsy Adaptors for Scikit-learn
33. advanced_git_nyu_2016 (HTML, 39 stars): Advanced git and github course material
34. textonboost (C++, 32 stars): Texton boost implementation in C++ by Philipp Kraehenbuehl
35. pydata-amsterdam-2016 (Jupyter Notebook, 31 stars): Machine Learning with Scikit-Learn (material for pydata Amsterdam 2016)
36. ml_meetup_nyc_2016 (Jupyter Notebook, 29 stars): Material for Machine Learning Meetup "Machine Learning with Scikit-learn"
37. odsc_east_2016 (Jupyter Notebook, 26 stars)
38. speed_reading (CSS, 25 stars): Speed reading app with running focus
39. slic-python (C++, 23 stars): SLIC wrapper for Python - legacy, rather use scikit-image now!
40. ml-workshop-short (HTML, 22 stars): Two hour interactive machine learning workshop
41. mlss_2015 (Python, 21 stars): Material for open source machine learning practical
42. jupytercon2017 (Jupyter Notebook, 21 stars): Material for Data analysis and machine learning in Jupyter
43. structured-prediction-workshop (TeX, 18 stars): Introduction to structured prediction with Python and pystruct
44. information-theoretic-mst (Python, 18 stars): Information Theoretic Clustering using Minimum Spanning Trees
45. advanced-sklearn-boston-nlp-2016 (Jupyter Notebook, 17 stars): Material and slides for Boston NLP meetup May 23rd 2016
46. nyu_ml_lectures (Python, 17 stars): Materials for NYU Machine Learning Guest Lectures
47. amueller.github.io (Less, 17 stars)
48. ImageNet-parsing-Python (Python, 16 stars): Python class to explore the ImageNet database
49. water_hackweek_2020_machine_learning (Jupyter Notebook, 15 stars): Water Hackweek Machine Learning workshop
50. strata-nyc-2016 (Jupyter Notebook, 15 stars): Materials fort Strata NYC 2016 scikit-learn tutorial
51. damascene-python-and-matlab-bindings (C++, 13 stars): Python and matlab bindings for the Damascene CUDA implementation of gPB
52. git_workshop (HTML, 11 stars): Material for git workshop
53. strata_singapore_2015 (Jupyter Notebook, 9 stars): Materials for Strata Singapore "Machine learning In Python with scikit-learn" tutorial.
54. sklearn_workshop (Python, 8 stars): Jupyter notebooks for interactive scikit-learn workshop
55. datasets (Python, 7 stars): Datasets of some standard computer vision / deep learning benchmarks
56. cv (TeX, 7 stars): Curriculum Vitae
57. GPU-Quickshift-Python-Bindings (C++, 7 stars): Python bindings for Brian Fultersons really quick shift
58. structured_prediction_talk (TeX, 6 stars): Slides for explaining structured prediction and PyStruct
59. oss-directions-webinar-2019 (Jupyter Notebook, 6 stars): Open Source Directions webinar materials
60. intro_to_ml_cuny_2015 (5 stars): Introduction to machine learning for CUNY
61. columbia-website (CSS, 5 stars): My official columbia page
62. phd-thesis-segmentation (TeX, 4 stars): unearthing my thesis - this is a backup
63. figures (3 stars): Some figures and drawings for talks
64. daimrf (Python, 2 stars): Python interface for inference with LibDAI
65. notebooks (2 stars): Random notebooks
66. vim-config (Vim Script, 2 stars)
67. nsf-biosketch (TeX, 2 stars): stand-alone nsf biosketch
68. oss_workshop (1 star): Demo repository for oss workshop
69. dask-learn (Python, 1 star)
70. gah (1 star): Code I don't want to keep reimplementing all the time
71. CZI-sklearn (TeX, 1 star)
72. dotfiles (Shell, 1 star): Another try to manage my dotfiles
73. icra_2014_crf_nyu (TeX, 1 star): ICRA 2014 paper on crfs for semantic segmenation on the nyu dataset