uber-expenses-tracking
The goal of this project is to track the expenses of Uber Rides and Uber Eats through data engineering processes, using technologies such as Apache Airflow, AWS Redshift and Power BI.
apache-spark-docker
Dockerizing an Apache Spark Standalone Cluster
csv-schema-inference
A tool to automatically infer column data types in .csv files
data-engineer-challenge
Data Engineer Challenge
pyspark-on-aws-emr
The goal of this project is to offer a ready-to-use AWS EMR template with Spot Fleet and On-Demand Instances, so you can focus on writing PySpark code.
pyDag
Scheduling Big Data Workloads and Data Pipelines in the Cloud with pyDag
Dropout-Students-Prediction
The goal of this project is to identify students at risk of dropping out of school
data-engineering-challenge-th
Dockerizing a Python web-scraping script and consuming the scraped data with FastAPI (www.metroscubicos.com)
D3JS-Dashboard
Building a responsive dashboard with D3.js and ASP.NET MVC from scratch (SQL Server, SSIS, REST API)
wbz
A parallel implementation of the bzip2 data compressor in Python. The compression pipeline uses the Burrows–Wheeler transform (BWT) and move-to-front (MTF) to improve Huffman compression. For now, the tool focuses only on compressing .csv files and other files in tabular format.
recommendation-system
Build a Content-Based Movie Recommender System (TF-IDF, BM25, BERT)
docker-livy
Dockerizing and Consuming an Apache Livy environment
text-analysis-speeches-amlo
Text analysis of the speeches, conferences and interviews of the current president of Mexico
tf-idf
Term Frequency-Inverse Document Frequency from Scratch
Huffman-decoding
A New Approach for Efficient Sequential Decoding of Static Huffman Codes
dataengineering-assignment
Prescreening Tasks for Data Engineer
distance-metrics
Distance metrics are one of the most important parts of many supervised and unsupervised machine learning algorithms; they help us calculate and measure similarities between numerical values expressed as data points.
csv-estimate-rows
csv-shuffler
A tool to automatically shuffle lines in .csv files
livyc
Apache Spark as a Service with Apache Livy Client
MachineLearning
The repository contains basic experiments using machine learning algorithms with Python
RESTful-APIs-Nodejs
Building fast, scalable and secure RESTful services with Node, Express and MongoDB
Moving-Average-Spark
How to Compute Moving Average with Spark
SparkSQL-with-Python
This repository has some examples of using Spark and SparkSQL with Python through PySpark
Wittline
Take a look at my repository
GPU-Programming-with-Python
GPU programming with Python: take advantage of the computing power of your graphics processing unit (GPU). We will work with NVIDIA’s CUDA library.
apache-spark-course
Apache Spark with Python
Data-Analytics-with-R
Repository for a data analytics course using R
optimizing-public-transportation
Streaming event pipeline around Apache Kafka and its ecosystem. Using public data from the Chicago Transit Authority, we construct an event pipeline around Kafka that simulates and displays the status of train lines in real time.
Contextual-Data-Transforms
This repository contains the most important contextual data transformation algorithms, which help improve the compression rate reached by statistical encoders. Ramses Alexander Coraspe Valdez
Computer-Vision-and-Deep-Learning
This repository contains information on the basic techniques and algorithms used in computer image processing, in addition to some projects related to pattern recognition using deep learning.
csv-generator
wittline.github.io
My GitHub profile
Python
Software Analysis, Design and Construction with Python
model-catalog-grpc
A gRPC service to consume any machine learning model stored in a model catalog through a single endpoint.
csv-splitter
csv-splitter
Python-recursion
This repository shows the implementation of the most common recursive algorithms
Multiprocessing
Improving the Performance in the Statistical Redistribution of Message Symbols using Architectural patterns for Parallel Programming
code_challenges
Scripts for different purposes
burrows-wheeler-transform
Implementation of the algorithm "Burrows Wheeler Transform" in python for data compression