  • Stars: 117
  • Rank: 291,100 (Top 6%)
  • Language: Java
  • License: Apache License 2.0
  • Created over 5 years ago
  • Updated about 1 month ago

Repository Details

Stream Discovery and Stream Orchestration

stream-registry

Announcement: 12th April 2019

We wanted to let you know that there are going to be some exciting developments with the Stream Registry project in the very near future. Stream Registry is being adopted by many brands at Expedia Group as a critical component of its digital nervous system for key streams across Expedia Group. Therefore, Vrbo stream registry is finding a new home.

What is changing

  • We will be investing in the project by expanding the existing team with full-time resources in several locations across Expedia Group. Expect greatly increased project activity: contributors, commits, issues, features, releases
  • The repository will relocate to the ExpediaGroup open source GitHub org in its entirety, preserving the history and community

What isn't changing

  • The original vision of Stream Registry as a Stream Discovery and Stream Orchestration platform
  • The project will remain open source, and will be joined shortly by other supporting Expedia Group stream platform components
  • Licenses, conduct and contribution guidelines will remain unchanged
  • The value of your contributions - please keep them coming!

We expect the start of this journey to be a little bumpy, but please bear with us as we work towards the first release of the Expedia Group Stream Registry!

About

A Stream Registry is what its name implies: a registry of streams. As enterprises scale, the need to organize and develop around streams of data becomes paramount. Synchronous calls are attracted to the edge, and a variety of synchronous and asynchronous calls permeate the enterprise. The need for a declarative, central authority for stream discovery and orchestration emerges, and this is what a stream registry provides. Much as DNS provides a name translation service for IP addresses, a Stream Registry provides a “metadata service” for streams. Centralizing stream metadata makes a translation service for producer and/or consumer stream coördinates possible. This centralized, yet democratized, stream metadata function reduces operational complexity through stream lifecycle management, stream discovery, stream availability and resiliency.

Why Stream Registry?

We believe that as changes to business requirements accelerate, time-to-market pressures increase, competitive measures grow, and migrations to cloud and other platforms become necessary, systems will increasingly need to become more reactive and dynamic in nature.

The issue of state arises.

We see many systems adopting event-driven architectures to meet changing business needs in these high-stakes environments. We hypothesize that there is an emerging need in the industry for a centralized "stream metadata" service that reduces the complexity of deploying and operating the stream platforms that serve as a distributed, federated nervous system in the enterprise.

What is Stream Registry?

Put simply, Stream Registry is a centralized service for stream metadata.

The stream registry answers the following questions and provides the following management functions:

  1. Who owns the stream?
  2. Who are the producers and consumers of the stream?
  3. Management of stream replication across clusters and regions
  4. Management of stream storage for permanent access
  5. Management of stream triggers for legacy stream sources

Architecture

StreamRegistryArchitecture

See the architecture/northstar documentation for more details.

Building locally

Stream Registry is built using OpenJDK 11 and Maven.

Stream Registry is currently packaged as a shaded JAR file. We leave specific deployment considerations up to each team since this varies from enterprise to enterprise.

To build Stream Registry as a JAR file, please run

./mvnw clean package

Start Stream Registry

Required Local Environment
The local 'dev' version of Stream Registry requires a locally running version of Apache Kafka and Confluent's Schema Registry on ports 9092 and 8081, respectively.

To get a local dev environment set up quickly, we recommend using Docker Compose.

Alternatively, you can start Confluent Platform locally by downloading the Confluent CLI and running the following commands. Note: The confluent command is currently only available for macOS and Linux. If using Windows, you'll need to use Docker, or run ZooKeeper, Kafka, and the Schema Registry individually.

confluent start zookeeper
confluent start kafka
confluent start schema-registry
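
Before starting Stream Registry, it can help to confirm that Kafka and the Schema Registry are accepting connections on the ports mentioned above. The following is a minimal sketch, not part of the project: `port_open` and `report` are hypothetical helper names, and it assumes `bash` and the GNU coreutils `timeout` command are available (on macOS, `timeout` may need to be installed separately).

```shell
#!/usr/bin/env bash
# Sketch: check that local dependencies are reachable before starting Stream Registry.
# port_open and report are hypothetical helpers, not part of Stream Registry
# or the Confluent CLI.

# Returns 0 if a TCP connection to host:port succeeds within 2 seconds.
port_open() {
  local host="$1" port="$2"
  # /dev/tcp is a bash feature; the child shell closes the socket when it exits.
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Prints a one-line status for a named service.
report() {
  local name="$1" host="$2" port="$3"
  if port_open "$host" "$port"; then
    echo "${name}: reachable on ${host}:${port}"
  else
    echo "${name}: NOT reachable on ${host}:${port}"
  fi
}

report "Kafka" localhost 9092
report "Schema Registry" localhost 8081
```

An open port only shows that something is listening there; it does not prove the service is healthy.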

Stream Registry can then be started.

Once Stream Registry has started, check that the application's GraphiQL server is running at http://localhost:8080/graphiql
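
One way to script this check is to poll the GraphiQL URL until it responds. This is a minimal sketch that assumes `curl` is installed; `wait_for_url` is a hypothetical helper, and the attempt count and delay are arbitrary choices rather than project recommendations.

```shell
#!/usr/bin/env bash
# Sketch: poll a URL until it answers with a successful HTTP status.
# wait_for_url is a hypothetical helper, not part of Stream Registry.

wait_for_url() {
  local url="$1" attempts="${2:-5}" i=0
  while [ "$i" -lt "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP error responses (4xx/5xx).
    if curl --silent --fail --max-time 2 --output /dev/null "$url"; then
      echo "up: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "down: $url"
  return 1
}

# The URL is the one given above; five attempts is an arbitrary choice.
wait_for_url "http://localhost:8080/graphiql" 5 \
  || echo "Stream Registry does not appear to be running yet"
```

A successful response only shows that the endpoint is being served. Opening the page in a browser is still the simplest way to confirm that GraphiQL itself loads.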

Kafka Version Compatibility

Stream Registry development and initial deployment started with Kafka 0.11.0 / Confluent Platform 3.3.0, and it has also been deployed against Kafka 1.1.1 / Confluent Platform 4.1.2.
As per the Kafka Compatibility Matrix, we expect Stream Registry to be compatible with Kafka 0.10.0 and newer. The versions of the internal Java Kafka clients used by Stream Registry can be found in the pom.xml.

Run Unit Tests

./mvnw clean test

Contributors

Special thanks to the following for making stream-registry possible at Vrbo and beyond!

  • Adam Westerman 💻
  • Arun Vasudevan 💻 🎨
  • Nathan Walther 💻 👀
  • Jordan Moore 💻 💁
  • Carlos Cordero 💻
  • Ishan Dikshit 💻 📖
  • Vinayak Ponangi 💻 📢 🎨 👀
  • Prabhakaran Thatchinamoorthy 💻 🎨
  • Rui Zhang 💻
  • Miguel Lucero 💻 💁
  • René X Parra 💻 📖 📝 📢 🎨 👀

This project follows the all-contributors specification.

Legal

This project is available under the Apache 2.0 License.

Copyright 2018-2019 Expedia, Inc.

More Repositories

1

graphql-kotlin

Libraries for running GraphQL in Kotlin
Kotlin
1,710
star
2

cyclotron

A web platform for constructing dashboards.
CoffeeScript
1,560
star
3

waggle-dance

Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
Java
258
star
4

styx

Programmable, asynchronous, event-based reverse proxy for JVM.
Java
249
star
5

adaptive-alerting

Anomaly detection for streaming time series, featuring automated model selection.
Java
200
star
6

jenkins-spock

Unit-test Jenkins pipeline code with Spock
Groovy
187
star
7

bull

BULL - Bean Utils Light Library
Java
182
star
8

c3vis

Visualize the resource utilisation of Amazon ECS clusters
JavaScript
163
star
9

jarviz

Jarviz is dependency analysis and visualization tool designed for Java applications
Java
119
star
10

flyte

Flyte binds together the tools you use into easily defined, automated workflows
Go
88
star
11

circus-train

Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.
Java
87
star
12

kubernetes-sidecar-injector

Kubernetes mutating webhook that injects a sidecar container into a pod
Go
74
star
13

mittens

Warm-up routine for http applications over REST and gRPC
Go
62
star
14

avro-compatibility

A user friendly API for checking for and reporting on Avro schema incompatibilities.
Java
56
star
15

expediagroup.github.io

The Expedia Group Open Source portal, a website for discovering EG open source projects.
JavaScript
53
star
16

graphql-component

Composable GraphQL components
JavaScript
53
star
17

heat

Heat Test Framework
Java
46
star
18

beekeeper

Service for automatically managing and cleaning up unreferenced data
Java
45
star
19

pitchfork

Convert tracing data between Zipkin and Haystack formats
Java
44
star
20

apiary

Apiary provides modules which can be combined to create a federated cloud data lake
35
star
21

pino-rotating-file

[DEPRECATED] A pino log transport for splitting logs into separate, automatically rotating files.
JavaScript
32
star
22

github-helpers

A collection of Github Actions that simplify and standardize common CI/CD workflow tasks.
TypeScript
30
star
23

vsync

Sync Secrets between HashiCorp vaults
Go
29
star
24

javro

JSON Schema to Avro Mapper
JavaScript
28
star
25

rhapsody

Reactive Streams framework with support for at-least-once processing
Java
28
star
26

plunger

A unit testing framework for the Cascading data processing platform.
Java
26
star
27

kube-graffiti

Paint your kubernetes objects with 'mutating' webhooks
Go
26
star
28

hiveberg

Demonstration of a Hive Input Format for Iceberg
Java
26
star
29

beeju

JUnit integration for testing the Apache Hive Metastore and HiveServer2 Thrift APIs
Java
24
star
30

react-event-tracking

React shared context utilities for analytic event tracking.
JavaScript
23
star
31

expediagroup.github.io-old

Expedia Group OSS Portal
HTML
23
star
32

shunting-yard

Shunting Yard is a real-time data replication tool that copies data between Hive Metastores.
Java
20
star
33

jasvorno

A library for strong, schema based conversion between 'natural' JSON documents and Avro
Java
18
star
34

apiary-data-lake

Terraform scripts for deploying Apiary Data Lake
HCL
18
star
35

datasqueeze

Hadoop utility to compact small files
Java
18
star
36

hello-streams

hello-streams :: Introducing the stream-first mindset
Java
16
star
37

container-startup-autoscaler

A Kubernetes controller that modifies the CPU and/or memory resources of containers depending on whether they're starting up, according to the startup/post-startup settings you supply.
Go
16
star
38

map-maker

Map maker is a command line tool and library for easily generating maps from structured data.
Jupyter Notebook
15
star
39

spinnaker-pipeline-trigger

Pipeline trigger for Spinnaker utilizing SNS
TypeScript
15
star
40

corc

An ORC File Scheme for the Cascading data processing platform.
Java
14
star
41

steerage

[DEPRECATED] Hapi server configuration and composition using confidence, topo, and shortstop.
JavaScript
14
star
42

fpsmeter

Optimized JavaScript utility for measuring frames per second in a browser environment. Useful for observing end-user client run-time performance without adversely impacting performance.
JavaScript
14
star
43

insights-explorer

Insights Explorer is a tool to catalogue and present analytical & research work.
TypeScript
13
star
44

service-client

[DEPRECATED] A general purpose http client built with extensibility in mind. It also features lifecycle hooks, dynamic hostname resolution, and circuit breaking.
JavaScript
12
star
45

molten

Molten is an opinionated library providing reactive tooling to simplify building production-ready integration solutions using Reactor.
Java
12
star
46

apiary-extensions

Extensions available for use in Apiary
Java
10
star
47

drone-fly

A service which allows Hive Metastore Listeners to be deployed outside of the Hive Metastore Service
Java
10
star
48

github-webhook-proxy

Request forwarder for GitHub webhooks from github.com to internal enterprise destinations, designed for use in Github Enterprise Cloud.
TypeScript
9
star
49

secrets-injector

Go
9
star
50

neaps

A simulator that forecasts the end of an agile project based on historical data, using Monte Carlo simulations
JavaScript
9
star
51

cypress-codegen

A Cypress plugin which automatically adds and enables IntelliSense for your custom commands!
TypeScript
9
star
52

circus-train-bigquery

Circus Train plugin which replicates BigQuery tables to Hive
Java
8
star
53

quibble

Data validator tool that allows testers, developers and analysts to define and execute test cases involving data. Quibble can compare data from one or more data platforms, assert on the outcome and produce a generated report on any anomalies in the data.
Java
8
star
54

dr-shadow

Dr Shadow is a library developed by Egencia (part of Expedia Group) that enables shadow traffic (ie. mirroring). It is a valuable tool for having good hygiene for service operations (ie. testing, resiliency, performance).
Java
6
star
55

hello-cloud

hello world example for Multicloud applications
HTML
6
star
56

flyte-client

A Go library designed to make the writing of flyte packs simple
Go
6
star
57

catalyst-server

[DEPRECATED] Configuration and composition management for Hapi.js applications.
JavaScript
6
star
58

aws-adfs-login

Library for user login (client side) using AWS ADFS (Active Directory Federation Service)
Go
5
star
59

comparadise

A visual comparison tool for reviewing visual changes on frontend PRs.
TypeScript
5
star
60

apiary-metastore-docker

Docker image for Apiary Data Lake metastore
Shell
5
star
61

apiary-ranger-docker

Docker image for Apiary Data Lake Ranger
Shell
4
star
62

flyte-jira

An Atlassian Jira integration pack for Flyte
Go
4
star
63

kafka-consumer-sns-sqs

Kafka Consumer for AWS SNS/SQS
Python
4
star
64

catalyst-render

[DEPRECATED] A hapi js plugin that works with catalyst-server to provide server-side rendering with react inside a handlebars template
JavaScript
4
star
65

expediagroup-python-sdk

Open World SDK for Python
Python
4
star
66

nimbuild

A suite of build tools that enable ultra fast web bundling at run-time.
JavaScript
4
star
67

circus-train-datasqueeze

Circus Train ⨉ DataSqueeze
Java
4
star
68

apiary-federation

Terraform scripts for deploying Apiary Data Lake federation
HCL
4
star
69

data-highway

Java
4
star
70

two-tower-lodging-candidate-generation

Python
4
star
71

flyte-bamboo

An Atlassian Bamboo integration pack for Flyte
Go
4
star
72

housekeeping

Common functionality for managing and cleaning up orphaned paths
Java
3
star
73

pkdd22-challenge-expediagroup

Expedia Group ECML/PKDD 2022 challenge
Python
3
star
74

icf

Independent connectivity forum API and tools
3
star
75

expediagroup-java-sdk

Open World SDK for Java.
Kotlin
3
star
76

hcommon-hive-metastore

General purpose libraries for interacting with the HiveMetaStore
Java
3
star
77

flyte-shell

Run shell scripts in your Flyte flows with this integration pack
Go
3
star
78

apiary-lifecycle

Terraform deployment scripts for Beekeeper
HCL
3
star
79

apiary-authorization

Authorization for Apiary Data Lake
HCL
3
star
80

flyte-serf

A Hashicorp Serf integration pack for Flyte
Go
3
star
81

network-plugin

The Network plugin allows developers to proxy requests and view the request and responses in IntelliJ.
Kotlin
3
star
82

a11y-tools

Client side A11y tools for trapping and tracking user focus.
JavaScript
2
star
83

flyte-ldap

An LDAP integration pack for Flyte
Go
2
star
84

new-project

This repository contains a template you can use to seed a repository for a new open source project.
2
star
85

expediagroup-nodejs-sdk

Expedia Group SDK for Node.js
TypeScript
2
star
86

apiary-drone-fly

Terraform scripts for deploying Drone Fly
HCL
2
star
87

dropwizard-resilience4j-bundle

Integration of Resilience4J into Dropwizard
Java
2
star
88

expediagroup-java-sdk-parent

2
star
89

parsec

Parsec is a data processing engine for interpreted queries.
Clojure
2
star
90

renovate-config-catalyst

[DEPRECATED] Renovate shared configuration for catalyst projects
2
star
91

flyte-slack

A Slack integration pack for Flyte
Go
2
star
92

openworld-sdk-java-generators

Mustache
2
star
93

flyte-graphite

A graphite integration pack for Flyte
Go
2
star
94

overwhelm

Operator for complex application deployment on Kubernetes
Go
2
star
95

package-json-validator

A Github Action for validating package.json conventions.
TypeScript
2
star
96

graphql-kotlin-codegen

A graphql-codegen plugin that enables type generation for GraphQL Kotlin services, promoting schema-first development.
TypeScript
2
star
97

helm-charts

Expedia Group Helm Charts
Mustache
1
star
98

determination

[DEPRECATED] Configuration resolver using confidence and shortstop.
JavaScript
1
star
99

dr-squid

Dr Squid is a downstream services and databases mocking tool primarily used for chaos testing and gathering performance metrics for Java Spring service
Java
1
star
100

spec-transformer

The API Spec Transformer Library
TypeScript
1
star