  • License
    Apache License 2.0
  • Created over 4 years ago
  • Updated about 4 years ago


Repository Details

Partly lecture and partly hands-on tutorial and workshop, this three-part series shows how to get started with MLflow. The series covers MLflow Tracking, Projects, Models, and Model Registry.

Managing the Complete Machine Learning Lifecycle with MLflow

Part 1 of 3

Other parts: mlflow-workshop-part-2 and mlflow-workshop-part-3

Content for the MLflow Series

Machine Learning (ML) development brings many new complexities beyond the traditional software development lifecycle. Unlike in traditional software development, ML developers want to try multiple algorithms, tools and parameters to get the best results, and they need to track this information to reproduce work. In addition, developers need to use many distinct systems to productionize models.

To solve these challenges, MLflow, an open source project, simplifies the entire ML lifecycle. MLflow introduces simple abstractions to package reproducible projects, track results, encapsulate models that can be used with many existing tools, and share models through a central repository, accelerating the ML lifecycle for organizations of any size.

Goal and Objective

Aimed at the beginner and intermediate levels, this three-part series teaches data scientists and ML developers how to use MLflow as a platform to track experiments, package projects to reproduce runs, use model flavors to deploy in diverse environments, and manage models in a central repository for sharing.

What you will learn

Understand the four main components of open source MLflow (MLflow Tracking, MLflow Projects, MLflow Models, and Model Registry) and how each component helps address challenges of the ML lifecycle.

  • How to use MLflow Tracking to record and query experiments: code, data, config, and results.
  • How to use the MLflow Projects packaging format to reproduce runs.
  • How to use the MLflow Models general format to send models to diverse deployment tools.
  • How to use Model Registry for collaborative model lifecycle management.
  • How to use the MLflow UI to visually compare experimental runs with different tuning parameters and evaluation metrics.

Instructor


About the MLflow workshop part 1

Part 1 covers:

  • Concepts and motivation behind MLflow
  • How to use Databricks Community Edition (DCE)
  • A tour of the MLflow API documentation
  • An introduction to the MLflow Python Fluent Tracking APIs
  • A walk-through of three machine learning models using the MLflow APIs in DCE
  • Using the MLflow UI in DCE to compare experiment metrics, parameters, and runs

Prerequisites

  • Before the session, please pre-register for Databricks Community Edition
  • Knowledge of Python 3 and programming in general
  • Preferably a UNIX-based, fully charged laptop with 8-16 GB of RAM and a Chrome or Firefox browser
  • Familiarity with GitHub and git, and a GitHub account
  • Some knowledge of Machine Learning concepts, libraries, and frameworks
    • scikit-learn
    • pandas and NumPy
    • matplotlib
  • [optional for part-1] PyCharm/IntelliJ or choice of syntax-based Python editor
  • [optional for part-1] pip/pip3 or conda and Python 3 installed
  • Loads of virtual laughter, curiosity, and a sense of humor ... :-)

Obtaining the Tutorial Material

Familiarity with git is important so that you can easily get all the material during the tutorial and workshop, and continue to work on it in your free time after the session is over.

git clone git@github.com:dmatrix/mlflow-workshop-part-1.git

or

git clone https://github.com/dmatrix/mlflow-workshop-part-1.git

Documentation Resources

This tutorial refers to the documentation for:

  1. MLflow
  2. NumPy
  3. pandas
  4. scikit-learn
  5. Keras
  6. TensorFlow
  7. Matplotlib

How to get started

We will walk through this during the session, but please sign up for Databricks Community Edition before the session:

  1. git clone git@github.com:dmatrix/mlflow-workshop-part-1.git
  2. Log in to the Databricks Community Edition
  3. Create an ML Runtime 6.5 cluster
  4. In the browser:
     • Go to the GitHub notebooks subdirectory
     • Download the MLFlow-CE.dbc file to your laptop
  5. Import the MLFlow-CE.dbc file into the Databricks Community Edition

Let's go!

Cheers,

Jules
