AWS Big Data Projects (@AWS-Big-Data-Projects)

Top repositories

1. aws-forest-fire-predictive-analytics (Python, 28 stars)
   Big Data Engineering & Analytics Project

2. Iot-and-Big-Data-Application-using-aws-and-apache-kafka (Python, 16 stars)
   IoT and big data analytics using Apache Kafka, Spark, and other AWS services

3. AWS-Data-Lake (16 stars)
   AWS Lake Formation makes it easy to set up, secure, and manage your data lakes. Also covers data discovery using Lake Formation's metadata search capabilities in the console, with search results restricted by column permissions.

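The catalog search that Lake Formation exposes in the console can also be driven programmatically via the Glue Data Catalog, which backs it. A minimal sketch using boto3's Glue `search_tables` API, with a client-side helper that mimics column-level visibility (the table shape and permitted-column list are illustrative assumptions, not taken from this repository):

```python
def search_tables(search_text, region="us-east-1"):
    """Free-text search over the Glue Data Catalog (which backs Lake
    Formation). Requires AWS credentials; not executed here."""
    import boto3
    glue = boto3.client("glue", region_name=region)
    resp = glue.search_tables(SearchText=search_text)
    tables = list(resp["TableList"])
    # Page through results until no continuation token remains.
    while resp.get("NextToken"):
        resp = glue.search_tables(SearchText=search_text,
                                  NextToken=resp["NextToken"])
        tables.extend(resp["TableList"])
    return tables


def visible_columns(table, permitted):
    """Illustrative client-side stand-in for Lake Formation's column-level
    permissions: keep only the column names the principal may see."""
    cols = table.get("StorageDescriptor", {}).get("Columns", [])
    return [c["Name"] for c in cols if c["Name"] in permitted]
```

In the real service, Lake Formation enforces column permissions server-side; the helper above only illustrates the effect on a returned table record.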
4. Analyzing-Twitter-in-real-time-with-Kinesis-Lambda-Comprehend-and-ElasticSearch (Python, 14 stars)
   Analyzing Twitter in real time with Kinesis, Lambda, Comprehend, and Elasticsearch

5. Analysing-Census-Data-using-aws (13 stars)
   Use Amazon EMR and Amazon Redshift to analyse the USA adult census dataset

6. big-data-solutions (Python, 13 stars)
   Code examples written in Python and Spark/Scala, primarily using boto3 SDK API methods, plus AWS CLI examples for the majority of AWS big data services. There are also nicely written wiki articles on most of the common issues and challenges faced in the big data world.

7. IoT-Data-with-Amazon-Kinesis (Python, 12 stars)
   Build a visualization and monitoring dashboard for IoT data with Amazon Kinesis Analytics and Amazon QuickSight

8. Run-a-Spark-job-within-Amazon-EMR (Java, 12 stars)
   Run a Spark job within Amazon EMR

9. AWS-EMR (12 stars)
   Analyzing big data with Amazon EMR

10. Big-Data-Beverage-Recommender-System (Jupyter Notebook, 12 stars)
    A big data beverage recommender system

11. aws-serverless-data-lake-workshop (Jupyter Notebook, 12 stars)
    This workshop gives customers hands-on experience with AWS big data and analytics services. The Serverless Data Lake workshop helps customers build a cloud-native, future-proof serverless data lake architecture, with hands-on time on services including Amazon Kinesis for streaming data ingestion.

12. big-data-ecosystem (Python, 12 stars)
    Project developed during the Cognizant Cloud Data Engineer Bootcamp on the Digital Innovation One platform. A Python algorithm extracts and counts the words of a book in plain-text format and displays the most frequent word.

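The word-count step that this description outlines can be sketched in a few lines of standard-library Python (the tokenization rule here is an assumption; the bootcamp project may split words differently):

```python
import re
from collections import Counter


def most_frequent_word(text, stopwords=frozenset()):
    """Lower-case the text, extract word tokens, count them, and return
    the (word, count) pair for the most frequent word (None if empty)."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stopwords)
    return counts.most_common(1)[0] if counts else None


# Example: read a plain-text book and print the winner.
# with open("book.txt", encoding="utf-8") as f:
#     print(most_frequent_word(f.read()))
```

Passing a stopword set lets you skip articles like "the" and "a", which otherwise dominate any book-length text.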
13. Amazon-Redshift-cluster-to-analyze-USA-Domestic-flight-data (12 stars)
    Works with an Amazon Redshift cluster to analyze USA domestic flight data. Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. It is optimized for datasets from a few hundred gigabytes to a petabyte or more and costs less than $1,000 per terabyte per year, about a tenth the cost of most traditional data warehousing solutions.

14. Log-Analytics-Solution-With-AWS (11 stars)
    Collect, process, and analyze log data using Amazon Kinesis and Elasticsearch Service

15. Airline_Data_Analysis (11 stars)
    Gather streaming data from an airline API using NiFi and batch data into AWS Redshift using Sqoop, build a data pipeline to analyse the data with Apache Hive and Druid, compare their performance, discuss Hive optimization techniques, and visualise the data with AWS QuickSight.

16. Data-Analytics-For-Mobile-Games (Python, 11 stars)
    PlayerUnknown's Battlegrounds (PUBG) is a shooter where the goal is to be the last player standing. You are placed on a giant circular map that shrinks as the game goes on, and you must find weapons, armor, and other supplies in order to kill other players and teams and survive.

17. Image-Caption-Generator (Python, 11 stars)
    In this project, a framework is developed leveraging artificial neural networks to "caption an image based on its significant features".

18. front-line-concussion-monitoring-system-using-AWS-IoT-and-serverless-data-lakes (Shell, 11 stars)
    A simple, practical, and affordable system for measuring head trauma in sports environments without trained medical personnel, built with Amazon Kinesis Data Streams, Kinesis Data Analytics, Kinesis Data Firehose, and AWS Lambda.

19. Analysis-Of-NYC-Yellow-Taxi (Python, 11 stars)
    The core objective of this project is to analyse the factors behind taxi demand: where the most pickups and drop-offs occur, when traffic is heaviest, and how to meet the public's needs.

20. big-data-challenge (Jupyter Notebook, 10 stars)
    Your first goal for this assignment is to perform the ETL process entirely in the cloud and upload a DataFrame to an RDS instance. The second goal is to use PySpark or SQL to perform a statistical analysis of selected data.

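The second goal (a statistical analysis via SQL) can be sketched locally using the standard library's sqlite3 as a stand-in for the RDS instance; the reviews table and its columns below are hypothetical, not the assignment's actual schema:

```python
import sqlite3

# In-memory database standing in for the RDS instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews "
    "(product_id TEXT, star_rating INTEGER, verified_purchase TEXT)"
)
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?)",
    [("A", 5, "Y"), ("A", 1, "N"), ("B", 4, "Y"), ("B", 4, "Y")],
)

# The kind of comparison such an analysis asks for: do verified
# purchases rate products differently from unverified ones?
rows = conn.execute(
    "SELECT verified_purchase, AVG(star_rating), COUNT(*) FROM reviews "
    "GROUP BY verified_purchase ORDER BY verified_purchase"
).fetchall()
print(rows)
```

Against a real RDS instance the same query would run over a driver such as psycopg2 or PyMySQL, or through a PySpark DataFrame loaded via JDBC.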
21. HeartRate-Monitoring-using-AWS-IOT-and-AWS-KINESIS (Python, 10 stars)
    You run a script that mimics multiple sensors publishing messages to an IoT MQTT topic, one message per second. The events are sent to AWS IoT, where a configured IoT rule captures all messages and forwards them to Kinesis Data Firehose. Firehose writes the messages in batches to objects stored in S3. Finally, you set up an Athena table over the S3 data and use QuickSight to analyze the IoT data.
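The first step of that pipeline (a script mimicking sensors) can be sketched with boto3's `iot-data` client; the topic name and message schema below are assumptions, not the repository's actual ones:

```python
import json
import time


def make_reading(sensor_id, heart_rate):
    """Build one sensor message as JSON (assumed schema)."""
    return json.dumps(
        {"sensorID": sensor_id, "heartRate": heart_rate,
         "timestamp": int(time.time())}
    )


def publish_readings(sensor_id, n, topic="heartrate/readings",
                     region="us-east-1"):
    """Publish n readings, one per second, to an AWS IoT MQTT topic.
    Requires AWS credentials and IoT permissions; not executed here."""
    import random
    import boto3
    iot = boto3.client("iot-data", region_name=region)
    for _ in range(n):
        iot.publish(
            topic=topic,
            qos=1,
            payload=make_reading(sensor_id, random.randint(60, 100)),
        )
        time.sleep(1)
```

An IoT rule with a Firehose action then handles the rest of the path to S3, so the publisher itself never touches Firehose directly.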