• Stars: 120
• Rank: 295,983 (Top 6 %)
• Language: Python
• Created about 5 years ago
• Updated over 1 year ago

Repository Details

Repo that relates to the Medium blog 'Keeping your ML model in shape with Kafka, Airflow and MLFlow'

Keeping your ML model in shape with Kafka, Airflow and MLFlow

How to incrementally update your ML model in an automated way as new training data becomes available

Fitting and serving your machine learning (ML) model is one thing, but what about keeping it in shape over time?

Let's say we have an ML model that has been put into production and is actively serving predictions. At the same time, new training data becomes available in a streaming fashion as users interact with the model. Incrementally updating the model with this new data can improve its performance and may also reduce model drift. However, it often comes with additional overhead. Luckily, there are tools that allow you to automate many parts of this process.

This repository tackles the topic of incrementally updating an ML model as new data becomes available. It mainly leans on three nifty tools: Kafka, Airflow, and MLFlow.

The corresponding walkthrough/post on Medium lays out the workings of this repo step-by-step.
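
To make the idea of an incremental update concrete, here is a minimal sketch in plain Python. It is not the repository's pipeline and uses none of Kafka, Airflow, or MLFlow: the synthetic data, the one-feature linear model, and the helper names (`sgd_update`, `maybe_promote`, `make_batch`) are all assumptions made for illustration. Each simulated batch of new data updates a candidate copy of the model, and the candidate replaces the current model only if it does not score worse on held-out data.

```python
import random


def sgd_update(weights, batch, lr=0.05):
    """Run one pass of stochastic gradient descent over a batch of (x, y) pairs."""
    w, b = weights
    for x, y in batch:
        err = (w * x + b) - y
        w -= lr * err * x
        b -= lr * err
    return (w, b)


def mse(weights, data):
    """Mean squared error of the linear model y = w * x + b on data."""
    w, b = weights
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)


def maybe_promote(current, candidate, holdout):
    """Return the candidate only if it is not worse than the current model."""
    if mse(candidate, holdout) <= mse(current, holdout):
        return candidate
    return current


def make_batch(rng, n):
    """Hypothetical stand-in for a stream: y = 2x + 1 plus a little noise."""
    xs = (rng.uniform(-1, 1) for _ in range(n))
    return [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.1)) for x in xs]


rng = random.Random(0)
holdout = make_batch(rng, 200)   # fixed evaluation set
model = (0.0, 0.0)               # untrained starting point
initial_error = mse(model, holdout)

for _ in range(50):              # 50 simulated batches of new data
    batch = make_batch(rng, 32)
    candidate = sgd_update(model, batch)
    model = maybe_promote(model, candidate, holdout)

final_error = mse(model, holdout)
print(f"holdout MSE before: {initial_error:.3f}, after: {final_error:.3f}")
```

In a real setup, the batches would come from a message stream, the update loop would be triggered by a scheduler, and the accepted models would be tracked in a model registry; the comparison step is one simple way to keep a bad update from replacing a good model.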

More Repositories

1. Predicting_real_estate_prices_using_scikit-learn (Python, 154 stars)
   Predicting Amsterdam house / real estate prices using Ordinary Least Squares-, XGBoost-, KNN-, Lasso-, Ridge-, Polynomial-, Random Forest-, and Neural Network MLP Regression (via scikit-learn)

2. Python_Portfolio__VaR_Tool (Python, 116 stars)
   Python-based portfolio / stock widget which sources data from Yahoo Finance and calculates different types of Value-at-Risk (VaR) metrics and many other (ex-post) risk/return characteristics both on an individual stock and portfolio-basis, stand-alone and vs. a benchmark of choice (constructed with wxPython)

3. dockerized_data_science_playground (Dockerfile, 10 stars)
   Multi-docker container data science / engineering playground (w/ Kafka, Airflow, MLFlow, Tensorflow-Keras / SKLearn) for simulating a microservices-oriented architecture

4. Solvency_II_Equity_Risk_Capital_Charge (Python, 7 stars)
   Python script for calculating the (type I) equity risk solvency capital charge ("SCR") under Solvency II

5. Hyperopt (Jupyter Notebook, 7 stars)
   Repo that relates to the Medium blog 'Using Bayesian Optimization to reduce the time spent on hyperparameter tuning'

6. Solvency_II_Spread_Risk_Capital_Charge (Python, 6 stars)
   Python script for calculating the spread risk solvency capital charge ("SCR") for a bond portfolio under Solvency II (along the standard formula)

7. Django-local-community-football-platform (HTML, 4 stars)
   Web-based local community football platform built on the Django web framework (with the help of Python, Bootstrap3, GeoPy, and GoogleMaps-API)

8. Mean_Variance_Portfolio_Optimization_with_Carbon_Intensity_Constraints (Python, 4 stars)
   Python script for single period Mean-Variance Optimization (MVO) with scope 1+2 carbon intensity constraints

9. Amsterdam-Airport-Schiphol-Flight-Data-App (Python, 1 star)
   Basic API-sourced python-based flight data widget for retrieving arrival and departures data for Amsterdam Airport Schiphol (along a GUI constructed with wxPython)