GetInData | Part of Xebia (@getindata)

Top repositories

1

flink-http-connector

HTTP Connector for Apache Flink. Provides sources and sinks for DataStream, Table and SQL APIs.
Java
122
star
2

kafka-connect-iceberg-sink

Java
75
star
3

dbt-flink-adapter

Adapter for dbt that executes dbt pipelines on Apache Flink
Python
73
star
4

kedro-kubeflow

Kedro Plugin to support running workflows on Kubeflow Pipelines
Python
46
star
5

dbt-airflow-factory

Library to convert DBT manifest metadata to Airflow tasks
Python
44
star
6

kedro-azureml

Kedro plugin to support running workflows on Microsoft Azure ML Pipelines
Python
31
star
7

kedro-vertexai

Kedro Plugin to support running workflows on GCP Vertex AI Pipelines
Python
29
star
8

kedro-airflow-k8s

Kedro Plugin to support running pipelines on Kubernetes using Airflow.
Python
29
star
9

apache-nifi-kubernetes

Shell
21
star
10

datapill

Big Data Newsletter
18
star
11

data-pipelines-cli

CLI for data platform
Python
18
star
12

streaming-jupyter-integrations

Python
17
star
13

doge-datagen

Python
15
star
14

kedro-sagemaker

Kedro Plugin to support running pipelines on AWS SageMaker.
Python
14
star
15

quickstart-ml-blueprints

Data science project development best practices and state of the art open-source tooling forged into a set of solved ML use cases to serve as blueprints for efficient prototyping.
Jupyter Notebook
14
star
16

helm-charts

GetInData Helm Charts repository
Smarty
12
star
17

flink-dynamic-cep-demo

Flink dynamic CEP demo
Java
12
star
18

docker-atlantis

Custom Atlantis docker image developed by GetInData
Shell
12
star
19

awesome-getindata-recommended-sources

A curated list of links to sources of latest updates in data/ml/ai
10
star
20

kedro-snowflake

Kedro Snowflake / Snowpark plugin
Python
10
star
21

jupyterlab-mlflow-extension

TypeScript
9
star
22

first-steps-with-data-pipelines

Dockerfile
9
star
23

flink-spring

A library that allows using Spring dependency injection framework in Flink Jobs
Java
9
star
24

jupyter-images

Recipes of publicly-available Jupyter images
Shell
8
star
25

terraform-azurerm-atlantis

Terraform module for deploying Atlantis in Azure Container Group
HCL
8
star
26

terraform-snowflake-role

Terraform module for managing Snowflake role and grants
HCL
7
star
27

terraform-module-template

Terraform module template - boilerplate used to simplify creation of new Terraform modules
HCL
7
star
28

BigDataTutorial

Java
7
star
29

terraform-azurerm-storage-account

Terraform Module for Azure Storage Account
HCL
6
star
30

flink-use-case

Scala
6
star
31

gitlab_cicd_templates

The project contains templates for CICD processes
6
star
32

flink-influxdb-reporter

Java
5
star
33

mlflow-docker

Docker image for MLflow.
Dockerfile
5
star
34

streaming-cli

Python CLI for streaming platform
Python
5
star
35

quickstart-ml-starter

Kedro starters to quickly set up new projects according to QuickStart ML Blueprints practice.
5
star
36

data-pipelines-template-example

The project contains an example of a template for creating a pipeline project with the GetInData Framework based on DBT
Dockerfile
5
star
37

training-infra

Scripts setting up infrastructure for trainings
Python
4
star
38

terraform-aws-organization

Terraform module for AWS Organization management
HCL
4
star
39

flink-tutorial-old

Java
4
star
40

dbt-images

Dockerfile
4
star
41

terraform-null-atlantis-repo-config

Module for generating an Atlantis repo config file. It contains a set of custom workflows
HCL
4
star
42

kedro-starters

Kedro starters by GetInData
Python
3
star
43

kafka-avro-producer

Java
3
star
44

flink-elastic-catalog

Flink Catalog for Elasticsearch.
Java
3
star
45

terraform-snowflake-database

Terraform module for managing Snowflake database
HCL
3
star
46

feast-kafka-postgres-demo

Jupyter Notebook
3
star
47

flink-sql-runner

Framework for scheduling Flink SQL jobs on AWS Elastic MapReduce or a standalone Flink cluster.
Python
3
star
48

streaming-ml-with-flink

Demo of running SciKit model on Flink, using MLeap serialization
Scala
3
star
49

terraform-azurerm-mlflow

Module for deploying a serverless MLflow instance on Azure, using Serverless SQL Server, Container App Service and Azure Blob Storage.
HCL
3
star
50

docker-image-template

Docker image template - boilerplate used to simplify creation of new docker images
Dockerfile
3
star
51

example-kedro-azureml-pytorch-distributed

This repository contains an example project showing how to run distributed PyTorch training on Azure ML pipelines with Kedro. See the related blogpost.
Python
2
star
52

kedro-airflow-gke-example

Example of how to use kedro-airflow plugin and GKE cluster/composer together.
Python
2
star
53

github-workflows

Collection of reusable GitHub Actions workflows
2
star
54

terraform-azurerm-public-ip

Terraform module for managing Azure Public IP
HCL
2
star
55

flink-tutorial

Java
2
star
56

terraform-snowflake-user

Terraform module for creating Snowflake users
HCL
2
star
57

terraform-snowflake-warehouse

Terraform module for Snowflake Warehouse management
HCL
2
star
58

mlflow-appengine-terraform

Terraform module for deploying MLflow on Google Cloud AppEngine Flexible
HCL
2
star
59

looker-pre-commit

A set of pre-commit hooks for Looker
Dockerfile
2
star
60

flink-workshop-task

Java
2
star
61

tpc-h-data-pipelines-demo

Dockerfile
2
star
62

terraform-snowflake-privatelink-aws

Terraform module for Snowflake AWS PrivateLink management
HCL
2
star
63

data-formats-benchmark

Java
2
star
64

terraform-aws-budget

Terraform module to manage AWS Budgets
HCL
1
star
65

test-spark-app

Skeleton for Spark Application with HiveContext and tests
Scala
1
star
66

mlops-gcp-vertex-snowflake-dbt

The repository contains code samples from "MLOPs for Pro's - Technical perspective. Build a Feature Store Faster - an Introduction to Vertex AI, Snowflake and dbt Cloud" ebook.
HCL
1
star
67

gid-mdp-workshop

Repository with lab exercises and their solutions
1
star
68

streaming-ml-with-ksql

Demo of running Spark MLlib model on Kafka with KSQL, using MLeap serialization
Python
1
star
69

dp-framework

1
star
70

terraform-azurerm-container-group

Terraform Module for creating Azure Container Group
HCL
1
star
71

kafka-streams-avro

Example application for Kafka Streams training
Java
1
star
72

ververica-platform-flink-workshop

All tools for local Ververica Platform setup with Flink SQL
1
star
73

flink-python-loader

Java
1
star
74

dbt-intro

Introductory repository to dbt with the use of data-pipelines-cli
1
star
75

flink-beam-poc

Java
1
star
76

data-pipelines-cli-init-example

The example for init template for Data Pipelines CLI tool
1
star
77

mlflow-demo

mlflow-demo
Jupyter Notebook
1
star
78

terraform-azurerm-subscription

Terraform Module for Azure Subscription
HCL
1
star
79

dbt-workflows-factory

Creates dbt based GCP workflows.
Python
1
star
80

openlineage-bdtw-column-lineage

Jupyter Notebook
1
star
81

dbt-common-macros

It contains macros shared between projects.
1
star
82

kedro-pyspark-k8s-demo

Python
1
star
83

flink-ververica-catalog-proxy

Proxy to the internal Ververica Catalog via Ververica REST Api
Java
1
star
84

terraform-snowflake-schema

Terraform module for managing Snowflake schemas
HCL
1
star
85

terraform-gke-helm-release

GKE Helm release module
HCL
1
star
86

py-pre-commit-hooks

This is a small repository that adds Python-based hooks for pre-commit
1
star