Laurence Geng (@bluishglc)

Top repositories

1

bdp

A prototype big data platform project, containing the source code for the book Big Data Platform Architecture and Prototype.
Java
176 stars
2

serverless-datalake-example

A serverless data lake project and framework based on AWS S3, Glue, Athena, MWAA and QuickSight. Through a series of best practices, it shows you how to build a serverless data lake.
Shell
17 stars
3

apache-hudi-core-conceptions

A set of notebooks that explore and explain core concepts of Apache Hudi, such as file layouts, file sizing, compaction and clustering.
Jupyter Notebook
7 stars
4

emr-edgenode-maker

A tool that makes it easy to build an EMR cluster edge node (also known as a client node or gateway node).
Shell
7 stars
5

ranger-emr-cli-installer

A powerful CLI tool for the automated installation of Apache Ranger and AWS EMR and their integration with OpenLDAP and Windows AD. It supports both open-source Ranger and EMR-native Ranger, supports both OpenLDAP and Windows AD, and works in all AWS regions (including the China regions).
Shell
7 stars
6

glue-hudi-integration-example

An example project demonstrating how Glue reads and writes Hudi datasets and syncs metadata to the Glue Catalog.
Scala
5 stars
7

ranger-emr-cfn-installer

A series of AWS CloudFormation templates that install Ranger and integrate it with an AWS EMR cluster, using a Windows AD or OpenLDAP server as the authentication channel.
2 stars
8

jmeter-demo

1 star
9

aws-cli-plus

A command-line tool that complements aws-cli. It offers a suite of utilities for managing and operating EC2, EMR and other AWS services.
Shell
1 star