Michael E. (@okmich)

Top repositories

1

cca175notes

Preparatory notes for the Cloudera Spark and Hadoop Certification
Scala
18
star
2

hadoop-training-projects

Projects from my Hadoop training sessions
PigLatin
17
star
3

big-data-olap

Showcasing online analytical processing (OLAP) with Apache Kylin
Java
8
star
4

rdbms_2_nosql

Course Material for "SQL, NoSQL, Big Data and Hadoop" course.
Java
7
star
5

realtime-data-collect-agg

Java
5
star
6

kafka-realtime-etl

Exploring an end-to-end data pipeline with Kafka https://www.dezyre.com/hackerday/streaming-etl-in-kafka-with-ksql
Java
5
star
7

sample-data-pipeline

A simple implementation of a batch data pipeline using Spark, Kafka and Hive. https://www.dezyre.com/hackerday/build-a-data-pipeline-based-on-messaging-using-spark-and-hive
Java
5
star
8

fx-usd-non-farm-payroll

Demonstrating and analyzing the impact of the US Non-Farm Payroll news on forex market prices for three currency pairs (EURUSD, GBPUSD, USDJPY) from 2008 to June 2017. The FX dataset for this analysis was downloaded from http://www.histdata.com/. It was further transformed, and the transformed version can be downloaded from https://drive.google.com/drive/folders/0B0MdkEsxQHAQclJxamJPdjJ4UkU?usp=sharing
Scala
5
star
9

log-file-processing

Course content for hackerdays - "Processing web server log" and "Real time log processing using streaming architecture" - https://www.dezyre.com/hackerday/real-time-log-processing-using-streaming-architecture
Java
5
star
10

spark-graphx-call-analysis

Analysis of Community Interactions using Spark GraphX
Scala
4
star
11

bigdataretail

Scala
3
star
12

movielens-lambda

Java
3
star
13

hive_in_depths

Scala
3
star
14

zeppelin-training

Materials for Zeppelin training on hackerday - https://www.dezyre.com/hackerday/data-analysis-collaboration-using-zeppelin
3
star
15

movielens-bigdata

Processing MovieLens dataset using Apache Spark and Hive
Scala
3
star
16

m2m-arch-pipeline

M2M IOT software architecture and use case
Java
2
star
17

githubarchive-analytics

Some cool big data learning topics using the GitHub dataset
Java
2
star
18

HiveVsImpala

PigLatin
2
star
19

spark-exercises

Exercises in the mastery of Spark 2.0
Scala
2
star
20

spring-boot-neo4j-movielens

Spring Boot application with Neo4j on the MovieLens dataset
Java
2
star
21

bigdata-sql-engines

Demonstrating a variety of SQL engines as part of Hackerday - https://www.dezyre.com/hackerday/choosing-the-best-sql-on-hadoop-engine
PigLatin
2
star
22

data-engineering-yelp-dataset

Scala
2
star
23

schoolruns

Java
1
star
24

jumo-mr

Java
1
star
25

insapp-lite

1
star
26

devdataprod014Assignment

Developing Data Products Course Project
R
1
star
27

MetisWFTools

HTML
1
star
28

algorithmic-trading-with-machine-learning

Jupyter Notebook
1
star
29

spark-twitter-sentiment-analysi

Real-time Twitter sentiment analysis with Spark (Scala)
Java
1
star
30

minisure

1
star
31

hackerday-jobportal-service

Course content for hackerday on building a job ad site
PigLatin
1
star
32

get-data-coursera

The course project for the Coursera Specialization Course - Getting and Cleaning Data
R
1
star
33

flume-kafka-dataflow

Demonstration of a data flow that integrates Apache Flume and Kafka, with conditional routing of data events within the flow
Java
1
star
34

auto-tracking

Demonstration of how to use the Redis key-value database to power a real-time auto-tracking system.
Java
1
star
35

designpatterns

Reinforcing design patterns with stories written in Java code
Java
1
star