  • Stars: 303
  • Rank 137,655 (Top 3 %)
  • Language: HTML
  • License: MIT License
  • Created about 6 years ago
  • Updated over 4 years ago

Repository Details

Introduction to Machine learning with Python, 4h interactive workshop

Introduction to Machine learning with scikit-learn

Part 1 of 4

Other parts:

Content

Instructor


This repository will contain the teaching material and other info associated with the "Introduction to Machine Learning with scikit-learn" course.

About the workshop

Machine learning has become an indispensable tool across many areas of research and commercial applications. From text-to-speech for your phone to detecting the Higgs boson, machine learning excels at extracting knowledge from large amounts of data. This talk will give a general introduction to machine learning, as well as introduce practical tools for you to apply machine learning in your research. We will focus on one particularly important subfield of machine learning, supervised learning. The goal of supervised learning is to "learn" a function that maps inputs x to an output y, by using a collection of training data consisting of input-output pairs. We will walk through formulating a problem as a supervised machine learning problem, creating the necessary training data and applying and evaluating a machine learning algorithm. This workshop should give you all the necessary background to start using machine learning yourself.
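
As a rough illustration of the supervised learning workflow described above, the following sketch learns a function from input-output pairs and evaluates it on held-out data. The dataset (iris), the model (k-nearest neighbors), and all parameter values are assumptions chosen for brevity, not necessarily what the workshop material uses.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Inputs X (flower measurements) and outputs y (species labels).
# The iris dataset ships with scikit-learn; it is an illustrative choice.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the input-output pairs so the model can be
# evaluated on data it has not seen during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# "Learn" the mapping from inputs to outputs using the training pairs only.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

# Apply the learned function to new inputs and measure accuracy.
y_pred = model.predict(X_test)
test_accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.2f}")
```

Evaluating on a held-out test set rather than on the training data gives an estimate of how well the learned function generalizes to new inputs.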

Prerequisites

This workshop assumes familiarity with Jupyter notebooks and basics of pandas, matplotlib and numpy.

Obtaining the Tutorial Material

If you are familiar with git, it is most convenient if you clone the GitHub repository. This is highly encouraged as it allows you to easily synchronize any changes to the material.

git clone https://github.com/amueller/ml-workshop-1-of-4.git

If you are not familiar with git, you can download the repository as a .zip file by heading over to the GitHub repository (https://github.com/amueller/ml-workshop-1-of-4) in your browser and clicking the green “Download” button in the upper right.

Please note that I may add to and improve the material until shortly before the tutorial session, so I recommend that you update your copy of the materials one day before the tutorial. If you have a GitHub account and forked or cloned the repository via GitHub, you can sync your existing copy with the following command:

git pull origin master

Installation Notes

This tutorial will require recent installations of

The last one is important: you should be able to type

jupyter notebook

in your terminal window and see the notebook panel load in your web browser. Try opening and running a notebook from the material to check that it works.

For users who do not yet have these packages installed, a relatively painless way to install all the requirements is to use a Python distribution such as Anaconda, which includes the most relevant Python packages for science, math, engineering, and data analysis. Anaconda can be downloaded and installed for free, including for commercial use and redistribution. The code examples in this tutorial require Python 3.5 or later.

After obtaining the material, we strongly recommend that you open and execute the Jupyter notebook check_env.ipynb, which is located at the top level of this repository. You can open the notebook by executing

jupyter notebook check_env.ipynb

inside this repository. Inside the Notebook, you can run the code cell by clicking on the "Run Cells" button as illustrated in the figure below:

Finally, if your environment satisfies the requirements for the tutorials, the executed code cell will produce an output message as shown below:
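
The contents of check_env.ipynb are not reproduced here. As a minimal sketch of what such an environment check might look like, the snippet below tries to import a few packages and reports their versions. The package list is an assumption inferred from the prerequisites above, and the function name check_environment is hypothetical; neither is taken from the actual notebook.

```python
import importlib
import sys

# Hypothetical package list, inferred from the prerequisites section;
# the real check_env.ipynb may test different packages or versions.
REQUIRED_PACKAGES = ["numpy", "pandas", "matplotlib", "sklearn"]


def check_environment(packages=REQUIRED_PACKAGES, min_python=(3, 5)):
    """Return a dict mapping each name to a version string, or None on failure."""
    report = {"python": None}
    # The tutorial states that Python 3.5 or later is required.
    if sys.version_info >= min_python:
        report["python"] = "{}.{}.{}".format(*sys.version_info[:3])
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            # Most scientific packages expose __version__; fall back if not.
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in check_environment().items():
        status = "OK  " if version is not None else "FAIL"
        print("[{}] {} {}".format(status, name, version or "not found"))
```

A line marked FAIL would indicate a package that could not be imported (or a Python version that is too old) and would need to be installed or upgraded before the tutorial.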
