  • Stars: 161
  • Rank: 225,892 (Top 5 %)
  • Language: Java
  • License: Apache License 2.0
  • Created: over 5 years ago
  • Updated: about 1 year ago

Repository Details

A full big data pipeline (Lambda Architecture) with Spark, Kafka, HDFS and Cassandra.

Lambda architecture

Read about the project here

Watch the videos demonstrating the project here

Our Lambda project receives real-time IoT data events from connected vehicles and ingests them into Spark through Kafka. Using the Spark Streaming API, we process and analyse the IoT data events and transform them into vehicle information. At the same time, the data is stored in HDFS for batch processing. We apply a series of stateless and stateful transformations to the streams with the Spark Streaming API and persist the results to Cassandra tables. To get accurate views, we also run a batch job that generates a batch view in Cassandra. We developed a responsive web traffic monitoring dashboard with Spring Boot, SockJS and Bootstrap, which reads the views from the Cassandra database and pushes them to the UI over a web socket.
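
To make the difference between the stateless and stateful steps concrete, here is a minimal plain-Java sketch. It does not use Spark at all, and the VehicleEvent fields, the RouteCounter class and the route names are assumptions made for illustration, not the project's actual model.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TrafficTransformSketch {

    /** Hypothetical event shape; the real project's fields may differ. */
    static final class VehicleEvent {
        final String vehicleId;
        final String routeId;
        final double speed;

        VehicleEvent(String vehicleId, String routeId, double speed) {
            this.vehicleId = vehicleId;
            this.routeId = routeId;
            this.speed = speed;
        }
    }

    /** Stateless step: each event is judged on its own (here, negative speeds are dropped). */
    static List<VehicleEvent> validEvents(List<VehicleEvent> batch) {
        return batch.stream()
                .filter(e -> e.speed >= 0)
                .collect(Collectors.toList());
    }

    /** Stateful step: per-route totals are kept across micro-batches. */
    static final class RouteCounter {
        private final Map<String, Long> totals = new HashMap<>();

        /** Adds one batch to the running totals and returns a snapshot of them. */
        Map<String, Long> update(List<VehicleEvent> batch) {
            for (VehicleEvent e : batch) {
                totals.merge(e.routeId, 1L, Long::sum);
            }
            return Map.copyOf(totals);
        }
    }

    public static void main(String[] args) {
        RouteCounter counter = new RouteCounter();
        List<VehicleEvent> first = List.of(
                new VehicleEvent("v1", "Route-37", 60.0),
                new VehicleEvent("v2", "Route-37", -1.0)); // dropped by the stateless filter
        List<VehicleEvent> second = List.of(
                new VehicleEvent("v3", "Route-37", 50.0),
                new VehicleEvent("v4", "Route-82", 40.0));

        counter.update(validEvents(first));
        Map<String, Long> totals = counter.update(validEvents(second));
        System.out.println("Route-37: " + totals.get("Route-37"));
        System.out.println("Route-82: " + totals.get("Route-82"));
    }
}
```

The filter needs no memory of earlier batches, while the counter only gives the right totals because it carries state from the first batch into the second.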

All components are managed with Docker, so you don't need to set up a local environment. The only thing you need is to have Docker installed.

System stack:

  • Java 11
  • Maven
  • ZooKeeper
  • Kafka
  • Cassandra
  • Spark 3
  • Docker
  • HDFS

The streaming part of the project is based on the iot-traffic-project from InfoQ.

How to use

  • mvn package
  • docker-compose -p lambda up
  • Wait for all services to be up and running, then...
  • ./project-orchestrate.sh
  • Run the realtime job: docker exec spark-master /spark/bin/spark-submit --class com.apssouza.iot.streaming.StreamingProcessor --master spark://localhost:7077 /opt/spark-data/iot-spark-processor-1.0.0.jar
  • Access the Spark cluster: http://localhost:8080
  • Run the traffic producer: java -jar iot-kafka-producer/target/iot-kafka-producer-1.0.0.jar
  • Run the service layer (web app): java -jar iot-springboot-dashboard/target/iot-springboot-dashboard-1.0.0.jar
  • Access the dashboard with the data: http://localhost:3000/
  • Run the batch job: docker exec spark-master /spark/bin/spark-submit --class com.apssouza.iot.batch.BatchProcessor --master spark://localhost:7077 /opt/spark-data/iot-spark-processor-1.0.0.jar
  • Run the ML job: docker exec spark-master /spark/bin/spark-submit --class com.apssouza.iot.ml.SpeedPrediction --master spark://localhost:7077 /opt/spark-data/iot-spark-processor-1.0.0.jar
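
The realtime, batch and ML jobs are submitted with the same command and differ only in the --class argument. The sketch below builds that argument list in Java. The SparkSubmitCommands class and its submitCommand method are hypothetical helpers, not part of the project, and the sketch only prints the commands; it does not run them.

```java
import java.util.List;

public class SparkSubmitCommands {

    // Values copied from the steps above.
    static final String CONTAINER = "spark-master";
    static final String SPARK_SUBMIT = "/spark/bin/spark-submit";
    static final String MASTER_URL = "spark://localhost:7077";
    static final String JAR = "/opt/spark-data/iot-spark-processor-1.0.0.jar";

    /** Builds the "docker exec ... spark-submit" argument list for one job's main class. */
    static List<String> submitCommand(String mainClass) {
        return List.of("docker", "exec", CONTAINER, SPARK_SUBMIT,
                "--class", mainClass,
                "--master", MASTER_URL,
                JAR);
    }

    public static void main(String[] args) {
        List<String> jobs = List.of(
                "com.apssouza.iot.streaming.StreamingProcessor",
                "com.apssouza.iot.batch.BatchProcessor",
                "com.apssouza.iot.ml.SpeedPrediction");
        for (String job : jobs) {
            // Print only; the list could be passed to new ProcessBuilder(...) to run it.
            System.out.println(String.join(" ", submitCommand(job)));
        }
    }
}
```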

Miscellaneous

Spark

Submit a job to master

  • mvn package
  • spark-submit --class com.apssouza.iot.streaming.StreamingProcessor --master spark://spark-master:7077 iot-spark-processor/target/iot-spark-processor-1.0.0.jar

Add spark-master to /etc/hosts, pointing to localhost.

GUI

  • Master: http://localhost:8080
  • Slave: http://localhost:8081

HDFS

Commands - https://hortonworks.com/tutorial/manage-files-on-hdfs-via-cli-ambari-files-view/section/1/

Open a file - http://localhost:50070/webhdfs/v1/path/to/file/file.csv?op=open

Web file handle - https://hadoop.apache.org/docs/r1.0.4/webhdfs.html
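
As a small illustration of the "open a file" URL pattern above, the following sketch builds the WebHDFS URL for a given HDFS path. The WebHdfsUrl class and the openUrl method are hypothetical names, and the host, port and path are just the placeholder values from the example URL.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class WebHdfsUrl {

    /**
     * Builds http://host:port/webhdfs/v1<hdfsPath>?op=open.
     * The multi-argument URI constructor percent-encodes illegal characters in the path.
     */
    static URI openUrl(String host, int port, String hdfsPath) throws URISyntaxException {
        if (!hdfsPath.startsWith("/")) {
            throw new IllegalArgumentException("HDFS path must be absolute: " + hdfsPath);
        }
        return new URI("http", null, host, port, "/webhdfs/v1" + hdfsPath, "op=open", null);
    }

    public static void main(String[] args) throws URISyntaxException {
        // Placeholder values taken from the example URL in this section.
        System.out.println(openUrl("localhost", 50070, "/path/to/file/file.csv"));
    }
}
```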

GUI

  • http://localhost:9870
  • http://localhost:50075

Kafka

  • kafka-topics --create --topic iot-data-event --partitions 1 --replication-factor 1 --if-not-exists --zookeeper zookeeper:2181
  • kafka-console-producer --request-required-acks 1 --broker-list kafka:9092 --topic iot-data-event
  • kafka-console-consumer --bootstrap-server kafka:9092 --topic iot-data-event
  • kafka-topics --list --zookeeper zookeeper:2181

Cassandra

  • Log in: docker exec -it cassandra-iot cqlsh --username cassandra --password cassandra
  • Access the keyspace: use TrafficKeySpace;
  • List data: SELECT * FROM TrafficKeySpace.Total_Traffic;

Please consider leaving a star if this project has helped you.

More Repositories

  • java-microservice (Java, 357 stars): A full microservice architecture with Java, Spring Cloud, log management with ELK, server load balancing with Nginx, infrastructure management with Docker-compose, JMX application monitoring, JWT, Aspect OP, distributed events with Kafka, Event Sourcing, CQRS, REST, Web Sockets, continuous deployment with Jenkins and more
  • grpc-production-go (Go, 189 stars): A gRPC production-ready library
  • chatflow (TypeScript, 110 stars): Leveraging LLMs to build conversational UIs
  • trading-system (Java, 64 stars): An open-source backtesting and live trading platform for foreign exchange
  • modern-api-management (Shell, 52 stars): A modern approach to managing APIs effectively using Protobuf
  • service-mesh-istio (41 stars): A microservice project leveraging Service Mesh with advanced features from Istio
  • smart-drone (JavaScript, 24 stars): This project leverages machine learning and computer vision to make a low-cost drone smarter and autonomous
  • computer-vision (Jupyter Notebook, 19 stars): A collection of computer vision projects
  • neuroevolution (JavaScript, 15 stars): In this project we combine Artificial Neural Networks and Genetic Algorithms to build powerful AI
  • neuralnet-browser (CSS, 11 stars): Artificial Neural Network from scratch using JavaScript in the browser
  • cnn-for-devs (Jupyter Notebook, 10 stars): A project to teach Convolutional Neural Networks to devs
  • grpc-production-java (Java, 8 stars): A production-ready gRPC server example
  • build-deploy (Shell, 5 stars): A build and deploy Docker image to work with Java applications and AWS
  • istio-and-minikube (4 stars): Customizable Istio installation for Minikube
  • image-edit (PHP, 3 stars): Image handler
  • k8s-microservices (Shell, 3 stars): The state of the art in microservices
  • admin2014 (JavaScript, 2 stars): My admin
  • githooks (Shell, 2 stars): Easy-to-use git hooks
  • trading-robot (Python, 2 stars): A strategy to beat the FX market
  • helpers (PHP, 2 stars): My helper classes that help me day to day
  • apssouza22 (2 stars): About me
  • video-chat-rtc (JavaScript, 2 stars): This is a video chat app using WebRTC and WebSockets. It is built using Node.js, Express, and Socket.io.
  • portfolio (JavaScript, 2 stars): Development of my professional portfolio
  • chat-commander-ui (JavaScript, 2 stars)
  • lambda-integration-test (TypeScript, 1 star): Project example of lambda integration test with AWS CDK + SAM + Docker + Docker-compose + Mock server
  • protobuf-gen-code (1 star): Generated Go code from the Protobuf-api-management repo
  • noblocking (PHP, 1 star): Testing non-blocking I/O with PHP, using socket and curl
  • tmc (PHP, 1 star)
  • js-canvas (JavaScript, 1 star): Assorted projects using canvas
  • shell-script (Shell, 1 star): My shell scripts that help me day to day
  • project-setup (Java, 1 star): An example of a Java project setup
  • js-inputsearch (JavaScript, 1 star): JavaScript that simplifies searching from a local or remote data source
  • blog (SCSS, 1 star): My blog posts
  • java-effective (Java, 1 star): Example of a Todo project using JEE features
  • angular2 (JavaScript, 1 star): Some Angular test projects
  • js-validate (JavaScript, 1 star): A simple version of a validation plugin
  • spl-navidareal (PHP, 1 star): Examples of SPL features that I use in my development cycle
  • gauge-python-api-example (Python, 1 star): Using Gauge and Python to test REST APIs
  • spring-modules (Java, 1 star): Working with Maven modules
  • eta-chome-extension (JavaScript, 1 star): A Chrome extension to help test the Eta UI
  • js-tab (PHP, 1 star): Class that simplifies working with tabs
  • html5_experience (JavaScript, 1 star): Experiments with HTML5 + responsive design + Facebook
  • js-sexy (JavaScript, 1 star): Assorted JavaScript projects
  • my-dao (PHP, 1 star): My database abstraction class structure
  • first-java-project (Java, 1 star): Code from the first Java project, simplified version