• Stars
    3
  • Rank 3,963,521 (Top 79 %)
  • Language
    Shell
  • Created over 7 years ago
  • Updated almost 7 years ago

Repository Details

Hive hybrid storage mechanism that reduces storage cost by keeping hot data in HDFS and cold data in S3 storage

More Repositories

1

Stock_Market_Prediction_with_SnP500

This project demonstrates the following capabilities: 1. Extraction, loading, and transformation of S&P 500 data and company fundamentals. 2. Exploratory and time series analysis of the stock data. 3. A stock screener based on fundamentals. 4. Stock price prediction using multiple machine learning models and/or an ensemble of them.
Jupyter Notebook
7 stars
2

mlops-github-actions

Set up a data science or machine learning project with automated training and deployment using GitHub Actions and Azure Machine Learning.
Python
4 stars
3

Manufacturing-Quality-Inspection

I built the computer vision models in three different ways to address different personas, because not all companies have a dedicated data science team.
Jupyter Notebook
4 stars
4

DataTransfer

Generic automation for transferring HDFS data and Hive databases between any environments (Production/QA/Development) using Amazon S3 storage
Shell
3 stars
5

Ten_Minute_ChatBot_Python

Python
2 stars
6

FindMyImage

Use AI, search, and exhaust data mining to resize, arrange, tag, categorize, caption, and search through all your images in a flash.
Python
2 stars
7

Complete-EDA-and-Sentiment-Analysis

In-depth EDA; sentiment analysis model building, evaluation, and selection; model deployment
Jupyter Notebook
2 stars
8

Stock_Sentiment_Analysis

Scrape stock-related news and perform sentiment analysis on it.
Jupyter Notebook
1 star
9

EncryptedDataTransfer

HDFS Encrypted zone intra-cluster transfer automation
Shell
1 star
10

MLOpsPy

MLOps with Azure ML
Python
1 star
11

Movies-Data-Consortium

The Movies dataset is extraordinarily rich, and a lot of interesting data science and exploratory data analysis can be done with it. In this project I created a movies data consortium by blending a file data store sourced from the Movies Dataset hosted on Kaggle, website data from Wikipedia, and API data from themoviedb.org.
Jupyter Notebook
1 star
12

databricks-workshops

Scala
1 star
13

meetupHouston100819

Helper for demo
1 star
14

AI_Enabled_Image_Bucketization

Bucketize an image based on exhaust data and AI generated data
Python
1 star
15

New_Mexico_Well_Data

Production data from New Mexico is published each month in a zipped, 36 GB XML file. It contains data for 55,000 wells over the past 30 years, and it grows by 300 MB per month. This repository uses Spark code to process the file.
Python
1 star
16

AIR-TRAVEL-SAFETY-DASHBOARD

No other form of transportation is as scrutinized, investigated and monitored as commercial aviation. Yet there are compelling statistics and figures that prove airline transportation to be the safest way to travel. In fact, based on odds of dying statistics, if someone did fly every day of their life, probability indicates that it would take more than seventeen thousand years before that person would succumb to a fatal accident. Seventeen thousand years!
1 star