  • Stars: 282
  • Rank: 146,549 (Top 3 %)
  • Language: Java
  • License: Apache License 2.0
  • Created over 6 years ago
  • Updated 11 months ago

Repository Details

Deep Learning UDF for KSQL for Streaming Anomaly Detection of MQTT IoT Sensor Data

Deep Learning UDF for KSQL / ksqlDB for Streaming Anomaly Detection of MQTT IoT Sensor Data

I built a KSQL UDF for sensor analytics. It uses the new KSQL API for building UDFs and UDAFs in Java to do continuous stream processing on incoming events. If you want to build your own UDF, see this blog post for a detailed how-to and for potential issues during development and testing: How to Build a UDF and/or UDAF in KSQL 5.0.

Use Case: Connected Cars - Real Time Streaming Analytics using Deep Learning

Continuously process millions of events from connected devices (sensors of cars in this example).

Architecture: Sensor Data via Confluent MQTT Proxy to Kafka Cluster for KSQL Processing and Real Time Analytics

This project focuses on the ingestion of data into Kafka via MQTT and the processing of that data via KSQL.

Live Demo Video - MQTT with Kafka Connect and MQTT Proxy

If you want to see the Apache Kafka / MQTT integration in a video, check out the following 15-minute recording, which shows a demo of my two GitHub examples:

Apache Kafka + MQTT Integration

Source Code

Here is the full source code for the Anomaly Detection KSQL UDF.

It is pretty easy to develop UDFs. Just implement the function in one Java method within a UDF class:

            @Udf(description = "apply analytic model to sensor input")
            public String anomaly(@UdfParameter("sensorinput") String sensorinput) {
                // YOUR BUSINESS LOGIC: evaluate sensorinput and return the result as a String
            }
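
To make the "business logic" placeholder more concrete, here is a minimal plain-Java sketch of what such a method body could delegate to. It is not the deep learning model used in this project: the class name `SensorAnomalyLogic`, the `#`-separated input format, the variance-based score and the threshold of 1.0 are all assumptions chosen only to keep the example self-contained and runnable without KSQL on the classpath.

```java
import java.util.Arrays;

/**
 * Plain-Java stand-in for the body of an anomaly UDF method.
 * Everything here (class name, input format, threshold) is hypothetical.
 */
final class SensorAnomalyLogic {

    // Hypothetical threshold; a real analytic model supplies its own decision rule.
    static final double THRESHOLD = 1.0;

    private SensorAnomalyLogic() { }

    /** Parses input such as "1.0#2.5#3.0" into numeric readings (assumed format). */
    static double[] parse(String sensorinput) {
        return Arrays.stream(sensorinput.split("#"))
                .map(String::trim)
                .mapToDouble(Double::parseDouble)
                .toArray();
    }

    /** Mean squared deviation of the readings from their own mean. */
    static double score(double[] readings) {
        double mean = Arrays.stream(readings).average().orElse(0.0);
        return Arrays.stream(readings)
                .map(r -> (r - mean) * (r - mean))
                .average()
                .orElse(0.0);
    }

    /** Same shape as the UDF method above: String in, String out. */
    static String anomaly(String sensorinput) {
        if (sensorinput == null || sensorinput.trim().isEmpty()) {
            return "invalid";
        }
        try {
            return String.valueOf(score(parse(sensorinput)) > THRESHOLD);
        } catch (NumberFormatException e) {
            // Malformed readings are reported instead of failing the stream.
            return "invalid";
        }
    }
}
```

Inside the UDF class, the annotated `anomaly` method could then simply return `SensorAnomalyLogic.anomaly(sensorinput)`. Keeping the logic in a class without KSQL dependencies is one possible design choice; it lets you unit test the scoring separately from the KSQL server.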

How to run it?

Requirements

The code is developed and tested on Mac and Linux operating systems. Kafka is not officially supported on Windows and does not work well there, so the code has not been tested on Windows at all.

  • Java 8
  • Confluent Platform 5.4+ (Confluent Enterprise if you want to use the Confluent MQTT Proxy, Confluent Open Source if you just want to run the KSQL UDF and send data via kafkacat instead of MQTT)
  • MQTT Client (I use Mosquitto in the demo as MQTT Client to publish MQTT messages - I don't even start the MQTT server! Thus, you can also use any other MQTT Client instead.)
  • kafkacat (optional - only needed if you do not want to use MQTT Producers; you can also use kafka-console-producer instead, but kafkacat is much more convenient)

Step-by-step demo

Install Confluent Platform and Mosquitto (or any other MQTT Client).

Then follow these steps to deploy the UDF, create MQTT events and process them via KSQL leveraging the analytic model.

Other information and projects related to Kafka and MQTT

If you want to find more details about Kafka + MQTT integration, take a look at my slides from Kafka Summit 2018 in San Francisco: IoT Integration with MQTT and Apache Kafka. The video recording is available on the website of Kafka Summit for free: Kafka MQTT Integration - Video Recording.

To see the other part (integration with sink applications like Elasticsearch / Grafana), please take a look at the project "KSQL for streaming IoT data", which shows how to realize the integration with Elasticsearch via Kafka Connect.

The GitHub project Apache Kafka + Kafka Connect + MQTT Connector + Sensor Data also integrates with MQTT devices. However, that project uses Confluent's MQTT Connector for Kafka Connect, i.e. a different approach where you use an MQTT Broker in between the devices and Kafka.

If you want to see a more powerful and complete demo, check out Streaming Machine Learning at Scale from 100000 IoT Devices with HiveMQ, Apache Kafka and TensorFlow.

More Repositories

  1. kafka-streams-machine-learning-examples (Java, 816 stars): This project contains examples which demonstrate how to deploy analytic models to mission-critical, scalable production environments leveraging Apache Kafka and its Streams API. Models are built with Python, H2O, TensorFlow, Keras, DeepLearning4 and other technologies.
  2. hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference (Jupyter Notebook, 386 stars): Real Time Big Data / IoT Machine Learning (Model Training and Inference) with HiveMQ (MQTT), TensorFlow IO and Apache Kafka - no additional data store like S3, HDFS or Spark required
  3. kafka-connect-iot-mqtt-connector-example (Shell, 201 stars): Internet of Things Integration Example => Apache Kafka + Kafka Connect + MQTT Connector + Sensor Data
  4. tensorflow-serving-java-grpc-kafka-streams (Java, 142 stars): Kafka Streams + Java + gRPC + TensorFlow Serving => Stream Processing combined with RPC / Request-Response
  5. python-jupyter-apache-kafka-ksql-tensorflow-keras (Jupyter Notebook, 94 stars): Making Machine Learning Simple and Scalable with Python, Jupyter Notebook, TensorFlow, Keras, Apache Kafka and KSQL
  6. ksql-fork-with-deep-learning-function (Java, 77 stars): Deep Learning UDF for KSQL, the Streaming SQL Engine for Apache Kafka with Elasticsearch Sink Example
  7. ksql-machine-learning-udf (Java, 12 stars): Confluent KSQL Addon - User Defined Function (UDF) for Machine Learning
  8. ksql-apache-kafka-quickstart-step-by-step-example (10 stars): KSQL Step-by-step tutorial using the basic functions of Apache Kafka's Streaming SQL Engine
  9. MavenCamelScala (8 stars): A working
  10. kafka-streams-machine-learning-docker-microservice (Java, 8 stars): A Kafka Streams Microservice (using Machine Learning) deployed as a Docker Container
  11. kafka-connect-smt-string-filter (Java, 7 stars): Single Message Transformation (SMT) for Kafka Connect to Filter String Messages
  12. camel-infoq (Java, 6 stars): Introduction Demo of Apache Camel (InfoQ article)
  13. Apache-Kafka-PipelineAI-Machine-Learning-Infrastructure (5 stars): Machine Learning Infrastructure leveraging Apache Kafka Ecosystem, PipelineAI, Kubernetes and Docker
  14. kafka-streams-python-integration-with-jep (2 stars): Calling Python Code / Script in Kafka Streams with Jep (TensorFlow Keras Example)
  15. tensorflow-tfserving-cloud-apache-kafka-streams (Java, 2 stars): Model Inference with Apache Kafka, Kafka Streams and a TensorFlow model deployed to TF-Serving (Google Cloud ML Engine)
  16. musiccrawler (Erlang, 1 star)
  17. gradlePlayApp (1 star)
  18. tensorflow-kafka-confluent-google-cloud (Jupyter Notebook, 1 star): Deep Learning at Extreme Scale with TensorFlow and Apache Kafka / Confluent on Google Cloud
  19. keras-tensorflow-model-inference-kafka-streams-ksql-dl4j (Java, 1 star): Keras Tensorflow Model deployed for Inference in Kafka Streams / KSQL microservice via Deeplearning4j