  • Stars: 3
  • Rank: 3,840,889 (Top 78 %)
  • Language: Scala
  • Created: over 2 years ago
  • Updated: over 2 years ago

Repository Details

Big data processing (real-time ETL data pipeline) using Avro Schema Registry, Spark, Kafka, HDFS, Hive, Scala, Docker, and Spark Streaming

More Repositories

1

Supervised-Binary-Classifier-For-IoT-Data-Stream

Supervised Binary Classifier For IoT Data Stream
Jupyter Notebook
3
star
2

Kafka-Stream-Data-Pipeline-Near-Real-Time

Stream data into a pipeline in near real time using Kafka
Scala
2
star
3

PySpark-Recommender-System

In this activity, I warm up with practical, hands-on experience of recommender systems in Spark. We will use the MovieLens dataset sample provided with Spark and available in the `data` directory.
Jupyter Notebook
2
star
4

Bixi-Cloud-ETL-Data-Pipeline-using-Scala-Hive-AWS_Athena_JDBC-Driver

An automated ETL data pipeline that extracts complex JSON data from a web API service (GBFS Bixi data) and converts it to CSV for loading into the HDFS data warehouse. Hive then processes the data further using external and managed tables. The same procedure is also applied with AWS S3 and Athena.
Scala
2
star
5

IntelligentSIDC_JAVA_ADT

Student Identification Code: a Java ADT implementation with O(1) to O(n) insert, update, delete, and find operations for large-scale data
Java
2
star
6

Spark_Big_Data_Processing_SQL

Spark Big Data Processing learning
Jupyter Notebook
1
star
7

Tweet_Analytics_WebSockets_SBT_build

Tweet analytics WebSockets application using the Play Framework with an sbt build.
Java
1
star
8

SOEN-7481_Research_Project_Paper

Research project report in IEEE format, titled "A Study on Clone Detection within Stack Overflow Code Snippets"
TeX
1
star
9

Code-Smell-Prediction_ML

Python: Code smell prediction using ML (10 cross-validations)
Jupyter Notebook
1
star
10

Evolution-of-Stack-Overflow

Studying the rise and the fall of Stack Overflow over the years, research in data mining
Jupyter Notebook
1
star
11

TwitterLytics-using-play-scala-java-sbt

This hands-on project is for practicing the Play Framework and reactive programming using Java streams, Scala, and the sbt tool
Java
1
star
12

Tool3-CalculateHTML-Tags-From-SO-Posts

Java
1
star
13

Metrics-Extraction-from-GitHub-MSR_Python

The main task of the tool is to extract metrics (e.g., #commits, #tags, #authors) from GitHub projects according to people-related settings, such as the number of developers in a project, the number of commits per project, and the number of commits per developer in a project, using Python
Python
1
star
14

Tool1-Extraction-of-SO-JAVA-CodeSnippets

This tool extracts Stack Overflow code snippets from CSV files
Java
1
star
15

Scala-Avro-Confluent-Schema-Registry

Scala
1
star
16

Evolution-of-the-Stack-Overflow-Over-the-Years

Research Project: Evolution of the Stack Overflow Over the Years using R, Stack Overflow Data Dump, MSSQL, Python
Jupyter Notebook
1
star
17

Tool2-ParseHTML-Results-Of-NiCad

Java
1
star
18

Code-Smell-Java-Code-analysis-tool

Exception handling anti-pattern code analysis tool
Java
1
star
19

Programming-with-Hive-JDBC-using-Scala-and-HDFS

Programming with Hive JDBC using Scala and HDFS
Scala
1
star
20

SQL-Scripts-For-SO-Clone-Detection

TSQL
1
star
21

Java-Problem-Solving

Java
1
star
22

AI-Face-Mask-Detector

Jupyter Notebook
1
star
23

Java-Code-Extraction-From-Stack-Overflow

Java
1
star
24

ManikHossain08

Big Data Engineer at Bell Canada
1
star
25

R-Mixed-Model-Cloned-Q-A

R
1
star
26

Spark-ETL-Data-Pipeline-using-SparkStreaming-HDFS-Kafka-Hive

The objective of this project is to gain experience coding with Spark, Spark SQL, Spark Streaming, Kafka, Scala, and functional programming
Scala
1
star
27

Bug-prediction-Model-Building-Using-R

How well can you predict post release bugs?
R
1
star
28

Java-OOP-Basic-Concept

Understanding basic Java concepts
HTML
1
star
29

Sales-Analysis-Bike-Shops-In-R

1
star
30

Python-Programming

1
star
31

Real-State-Project

Real estate project using C# .NET (server-side API) and AngularJS (HTML client)
C#
1
star
32

JAVA-Design-Patterns-Software-Development-Methodologies

Applying different design patterns in different Java projects. Designing and implementing (some of the) Dungeons and Dragons character classes using various Java design patterns.
Java
1
star
33

ETL-Data-Pipeline-using-HDFS-Hive-Scala

ETL data pipeline using HDFS, Hive, and Scala
Scala
1
star
34

Code-Clone-within-SO-Mixed_Model_Building

Research Project: A study on clone detection within the Stack Overflow code snippets. Vote Variance Factors Analysis of Similar Stack Overflow Posts (Q&A) by building Mixed Modelling.
TeX
1
star
35

STM-Data-Enrichedment-With-Hadoop-Scala

STM data enrichment: Extract, Transform, Load (i.e., ETL)
Scala
1
star
36

SQL-HackerRank-Hard-Challenges-MSSQL

My own solutions to HackerRank's hard SQL challenges.
TSQL
1
star
37

Scala-Programming-With-SBT

Scala basics from beginner to advanced level. The language is intuitive to use, requires less code, and avoids the verbosity of Java.
Scala
1
star
38

Customer-Segmentation_K-Means-Clustering-in-R

Which stock prices behave similarly? An organization wants to know which companies are similar to each other to help identify potential customers of a SaaS software solution (e.g., Salesforce CRM or equivalent) in various segments of the market. The Sales Department is very interested in this analysis, which will help them more easily penetrate various market segments.
R
1
star