  • Stars: 1
  • Language: Python
  • License: Apache License 2.0
  • Created over 1 year ago
  • Updated 10 months ago

Repository Details

Framework to automate ETL pipeline creation with a touch of AI.

More Repositories

1. airflow-maintenance-dags (Python, 1,670 stars)
   A series of DAGs/Workflows to help maintain the operation of Airflow

2. airflow-rest-api-plugin (Python, 325 stars)
   A plugin for Apache Airflow that exposes rest end points for the Command Line Interfaces

3. airflow-scheduler-failover-controller (Python, 232 stars)
   A process that runs in unison with Apache Airflow to control the Scheduler process to ensure High Availability

4. hadoop-deployment-bash (Shell, 34 stars)
   Code for the deployment of Hadoop clusters, written in Bourne or Bourne Again shell.

5. apache-airflow-cloudera-csd (Shell, 20 stars)
   CSD for Apache Airflow

6. airflow_demo (Java, 18 stars)
   Airflow script for incremental data import from Mysql to Hive using Sqoop.

7. apache-airflow-cloudera-parcel (Dockerfile, 17 stars)
   Parcel for Apache Airflow

8. jenkins-workspace-cleanup-groovy-script (Groovy, 16 stars)
   Jenkins Workspace Cleanup script to automate folders clean up for all the jobs

9. airflow-user-management-plugin (Python, 14 stars)
   A plugin for Apache Airflow that allows you to manage the users that can login

10. hadoop-smoke-tests (8 stars)
    Basic smoke tests to determine component functionality of a Hadoop cluster.

11. terraform-hadoop-talk (HCL, 6 stars)
    Set up the AWS infrastructure for a small Hadoop cluster as well as install the Cloudera Manager server and agents.

12. airflow-plugins (Python, 5 stars)
    A series of Plugins for Apache Airflow (https://airflow.incubator.apache.org/)

13. intro-to-spark (Java, 3 stars)

14. NameDatabases (3 stars)
    List of public, open source Name Databases

15. cdp-azure (HCL, 2 stars)
    Bits and pieces to make it easy to set up CDP on Azure

16. MongoDB_OPSLOG (Python, 2 stars)

17. SparkCluster_Ansible (Shell, 2 stars)

18. database-comparison-tool (Java, 2 stars)

19. spark-streaming-workshop (Java, 2 stars)

20. nagios-plugins (Python, 2 stars)
    Plugins built for Nagios

21. saleor-storefront-poc (TypeScript, 2 stars)
    Customizing Saleor storefront to add more features and evaluate.

22. clairthon-ambivalent-aardvarks (TypeScript, 1 star)

23. rabbitmq-cloudera-parcel (Python, 1 star)
    RabbitMQ parcel to be deployed and managed through Cloudera Manager

24. skills-base (Java, 1 star)

25. spark-workshop-2x (Java, 1 star)

26. automated-hadoop-smoke-test (Shell, 1 star)
    Basic smoke tests to determine component functionality of a Hadoop cluster.

27. data-scalaxy-test-util (Scala, 1 star)
    A scala library that provides additional utilities for testing spark applications.

28. spark-batch (Java, 1 star)
    Template repository for spark-batch

29. GCP-serv (1 star)

30. restonomer (Scala, 1 star)
    Framework to ingest data from REST APIs, transform and persist the data.

31. snowflake-poc (1 star)
    Snowflake PoC

32. vagrant-sparkbuilder (Puppet, 1 star)
    Simple environment to help rebuild Cloudera's Apache Spark.

33. auto-etl (Python, 1 star)

34. IntroToMachineLearning (Python, 1 star)
    Intro to machine learning - Code for article at http://blog.clairvoyantsoft.com/2015/03/intro-to-machine-learning/