Dan Zaratsian (@zaratsian)

Top repositories

1

Spark

Apache Spark (Scala, PySpark, SparkR) Code, Tricks, and References
Jupyter Notebook
71
star
2

SparkHBaseExample

Spark code to analyze HBase Snapshots
Scala
35
star
3

network_topology_analysis

Code to collect and analyze traceroute data within a network topology
Scala
25
star
4

HDP_Tuning_Unofficial

Collection of HDP Tuning Tricks & Tips (unofficial guide)
Python
17
star
5

SparkPhoenix

Spark Example using Phoenix to interact with HBase
Scala
14
star
6

dynamic_time_warping

Spark (PySpark) script that applies dynamic time warping to Energy usage data (using the python fastdtw package)
Python
13
star
7

Datasets

Interesting Public Datasets
11
star
8

Apache_NiFi

Code, projects, and references for Apache NiFi
Python
10
star
9

Google-Cloud-Scripts

Google Cloud Platform Scripts
Python
9
star
10

docker_containers

Docker Containers with HDP Services/Code (Spark, Kafka, NiFi, Solr, Tensorflow...)
JavaScript
6
star
11

DL_Image_Classification

Deep Learning Image Classification - Scripts and Links
JavaScript
5
star
12

iaa-2023

Institute for Advanced Analytics, 2023
Jupyter Notebook
5
star
13

Apache_Hive

Apache Hive (SQL on Hadoop) Syntax, Cheatsheet, and Projects
Python
4
star
14

iaa-2022

Institute for Advanced Analytics, 2022
Jupyter Notebook
4
star
15

iaa_2020

Institute for Advanced Analytics - 2020
Python
3
star
16

hive_udf

Apache Hive - UDF Example with Python
Python
3
star
17

Hortonworks_Hackathon_Ad_Server

Hortonworks Hackathon - Ad Server Assets
JavaScript
3
star
18

python

Python Scripts, Tricks, and References
Python
2
star
19

nfl_predictions

NFL Predictions (WebApp with PySpark)
JavaScript
2
star
20

iaa_2021

Institute for Advanced Analytics 2021
Jupyter Notebook
2
star
21

Disaster_Recovery

Resources, tricks, and recommendations for DR (Disaster Recovery) Hadoop clusters
Shell
1
star
22

Sqoop

Sqoop - Bulk Load Data into HDFS, Hive, HBase, etc.
1
star
23

sas_esp

SAS Event Stream Processing
Python
1
star
24

Apache-Ranger

Hadoop Security and Policy Management - Syntax, Tricks, and Resources
Python
1
star
25

cloud-endpoints

Cloud Run API Backend
Shell
1
star
26

gcp_dataflow

Google Dataflow - Scripts and References
Python
1
star
27

gcp-data-streaming

Google Cloud Data Streaming Architecture
Python
1
star
28

HBase_Phoenix

Apache HBase & Phoenix Scripts and Code Examples
Shell
1
star
29

sas

SAS Scripts
SAS
1
star
30

Google-ML

Google Machine Learning Script and Assets
Python
1
star
31

video-processing

GCP Video Processing with Speech to Text
HTML
1
star
32

ML-Model-Deployment

Scripts, Tips, and Tricks for Deploying ML and Deep Learning Models into Production
Shell
1
star
33

Hortonworks_Installation

Hortonworks DataFlow (HDF) Installation/Config, Scripts, and Tricks
Shell
1
star
34

GPUs_Tensorflow

GPU Scripts to set up and run TensorFlow (and other DL/ML libraries)
Shell
1
star
35

hortonworks_hdf_workshop

Hortonworks HDF Workshop Vagrant Image
1
star
36

video_analysis

Real-time Video Analysis, Object Detection
Python
1
star
37

Cloud-DevOps

Google Cloud DevOps Scripts and References
Python
1
star
38

genai-text-to-3d-mesh

Kubernetes deployment for text-to-3D mesh generation using OpenAI Point-E
HCL
1
star
39

Apache-Atlas

Hadoop Data Lineage and Metadata - Configuration, Scripts, and Tricks
Python
1
star