• Stars
    3
  • Rank 3,963,521 (Top 79 %)
  • Language
    Scala
  • Created almost 5 years ago
  • Updated over 1 year ago

Repository Details

Collection of code for submitting Spark/Hadoop/Hive/Pig tasks to EMR (AWS Elastic MapReduce) | #DE

More Repositories

1

CS_basics

My CS learning : algorithm, data structure, and system design | #SE
Python
64
star
2

NYC_Taxi_Pipeline

Design/Implement stream/batch architecture on NYC taxi data | #DE
Scala
22
star
3

DE-100-days

data engineering 100 days 🧲 🦾 | #DE
20
star
4

NYC_Taxi_Trip_Duration

Develop ML models to predict taxi trip duration in NYC. Ranked: Top 6% | RMSLE: 0.377 (Kaggle) | #DS
Jupyter Notebook
16
star
5

knowledge_base_repo

Resources for software/backend/data learning | #SE | #DE | #DS
Shell
16
star
6

utility_Python

Collection of Python utility scripts & OOP basic demo | #SE
Python
14
star
7

analysis

Repo of practical approaches to data science problems, including notebook demos and working scripts | #DS | #analysis
Jupyter Notebook
12
star
8

spark-etl-pipeline

Various data stream/batch processing demos with Apache Spark (Scala) 🚀
Scala
11
star
9

web_scraping

Collect/process data from various sources: website / JS website / API. Run the scraping pipeline via Celery and a Travis cron task. Dump the scraped data to Slack
Jupyter Notebook
11
star
10

AirflowJob

Airflow POC demo : 1) env set up 2) airflow DAG 3) Spark/ML pipeline | #DE
Python
11
star
11

spotify_recSys_challenge2018

Python
6
star
12

airflow-ubuntu-dev

Demo: deploy Airflow to Heroku
Shell
6
star
13

data_infra_repo

Collection of POC/dev data infrastructure | #SE
Python
6
star
14

mlflow-heroku-dev

Demo: deploy MLflow to Heroku
Shell
5
star
15

YelpReviews

Build an end-to-end data application with the Yelp review dataset (data collection -> DB config -> data ETL -> data dashboard (analysis/ML))
Python
5
star
16

SpringPlayground

Java Spring (Boot/Cloud..) backend playground | #SE
JavaScript
3
star
17

KafkaSparkPoc

Build Kafka <-> Spark, Kafka <-> Kafka .. stream processing POC | #DE
Scala
3
star
18

utility_Scala

Scala programming language basics, functional programming (FP), and other utility cases | #SE
Scala
3
star
19

movie_recommendation

Build recommendation algorithms on movie data that could potentially be applied to other customized content such as music and e-commerce
Jupyter Notebook
2
star
20

ScalaPlayground

Scala POCs : API, backend, tools | #SE
Scala
2
star
21

FinatraHelloWorld

Build a REST API in Scala with the Finatra framework
Scala
2
star
22

KafkaHelloWorld

Kafka application/infra POC | #SE
Scala
2
star
23

spotify_recommend_playlist

Generate suggested song playlists via a "Tinder-like" process. Leverages the Spotify recommendation API and ML
JavaScript
2
star
24

dj-playground

Django playground : web app, backend | #SE
Python
2
star
25

utility_shell

Collection of shell/Bash scripts for various use cases | #SE
Shell
2
star
26

SGTaxiMap

SG taxi real-time heat map built with the gov.SG API, Python Flask, and the JS Leaflet library. Docker/Travis CI integrated
Python
2
star
27

til

Today I Learned
2
star
28

spark-scala-word-count

Run a simple Spark word count job via 1) Scala sbt Spark 2) Docker
Scala
2
star
29

KKBox_Music_Recommendation

Jupyter Notebook
1
star
30

RabbitMQHaloWorld

RabbitMQ hello world project
Python
1
star
31

repos

1
star
32

data-platform-poc

1
star
33

JS_Playground

JavaScript code playground
HTML
1
star
34

Kaggle.com_mini_project

Public data competitions & projects on https://www.kaggle.com/
Jupyter Notebook
1
star
35

LambdaHelloWorld

AWS Lambda demo projects
Java
1
star
36

Redshift-poc

1
star
37

HousePricePredAPI

ML API that predicts house prices, wrapped in Docker and deployed to AWS ECS/Fargate | #DE | #ML
Python
1
star
38

JavaHelloWorld

Java basics project : data structure, algorithm, syntax, concept
Java
1
star
39

CDKPoc

https://cdkworkshop.com/
TypeScript
1
star
40

IreHouse

1
star
41

yennj12_blog_V2

My current blog (via fastpages template)
Jupyter Notebook
1
star
42

ScalaCoursera

My learning from the Coursera Functional Programming Scala course | #SE
Scala
1
star
43

GitCommitQ

ETL to collect GitHub commit data (no API account required)
Python
1
star