• Stars
    16
  • Rank 1,311,288 (Top 26 %)
  • Language
    Python
  • Created about 6 years ago
  • Updated over 5 years ago

Repository Details

Slowly Changing Dimension (SCD) Type 2 implemented in Hive Query Language using the exclusive join technique on ORC Hive tables, with a performance comparison of partitioned and clustered Hive tables

More Repositories

1

Facebook-Data-Extraction

#DataPipeLine #ETL - A Facebook data extraction utility that extracts publicly available data from Facebook. It uses the Facebook Graph API and Python to extract the data and load it into CSV files for further analysis.
Python
14
star
2

spark-slowly-changing-dimension

Spark implementation of Slowly Changing Dimension type 2
Scala
11
star
3

YouTube-comments-Spam-Detector

YouTube Spam comments classifier using Naive Bayes and SVM
Jupyter Notebook
3
star
4

Spark-Practice-Repository

Apache Spark practice (Core API, DataFrames and Spark SQL) using Python
Python
2
star
5

Defensive-Forecasting

Defensive Forecasting is an online forecasting technique for Binary Labels
Python
2
star
6

No-Show-Patients-Analysis

Model to predict which patients are likely to miss a booked appointment, using logistic regression and random forest machine learning techniques
R
2
star
7

Machine-Learning-Assignments-1

Machine learning course assignments
R
1
star
8

Kiva-Loan-Data-Warehouse

#Pyspark #HDFS #Spark #DataAnalysis - A Kiva loan data mart used to transform and analyse Kiva loan data and to help understand the lenders' targeted loan community
Python
1
star
9

Stayzilla-Operation-Failure-Analysis

This project analyzes the operational failure of Stayzilla.com (an Airbnb-like startup in India)
R
1
star
10

Exploratory-and-Descriptive-Analysis-on-Loan-Dataset

This project conducts a set of exploratory analyses and applies various machine learning techniques to predict loan borrowers' default rate. It also uses various data visualization techniques to show the data distribution.
R
1
star
11

Intro-to-Data-Science-Assignment

Introduction to Data Science course assignments
Jupyter Notebook
1
star
12

Text-Language-Detector

Language Detector program using Google Translator API
Python
1
star