The Data Explorer gives you fast, safe access to data stored in Cassandra, Dynomite, and Redis.

Netflix Data Explorer

The Netflix Data Explorer tool allows users to explore data stored in several popular datastores (currently Cassandra, Dynomite, and Redis).

Quick Start

To help get you started, we have provided a demo environment that you can run via Docker. You'll need to run a couple of commands to get up and running (install Yarn first if you don't already have it).

Note: if using Apple silicon, you will need to run the Terminal using Rosetta.

yarn
yarn docker:demo

This will run a production build of the app, a Cassandra instance, and a Redis instance, all locally.

Note: the first time you run this command, it will take some time to pull down the images and run a complete build. Once the app starts, your browser will open it. You may also need to wait a minute while C* and Redis start up. Run the command, and then grab some ☕️.

After the first invocation, future invocations of yarn docker:demo will be much faster.

Also note: If you see a "No Hosts Available" error, it's likely due to C* still starting up.

Docker Configuration

To run the demo environment, you will need to allocate 2GB of memory to Docker. If you are using Docker for Windows or Mac, you can do this by adjusting the slider on the Resources tab of the Preferences page. Note that 2GB is also the current default for Docker for Windows or Mac.

You can also run docker system info | grep Memory to view your allocated memory if you installed Docker via some other means (e.g. VirtualBox/brew/etc). Please see the documentation for your particular Docker installation on how to change these settings.
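If you want to check the allocation from a script, the memory line can be parsed. The following is a minimal sketch, assuming the output contains a line of the form "Total Memory: 1.944GiB"; the function name `parseTotalMemoryGiB` is hypothetical and not part of the Data Explorer.

```typescript
// Parse a line such as "Total Memory: 1.944GiB" and return the amount in GiB.
// Assumption: the line format matches typical `docker system info` output.
function parseTotalMemoryGiB(infoOutput: string): number | undefined {
  const match = /Total Memory:\s*([\d.]+)\s*(KiB|MiB|GiB|TiB)/.exec(infoOutput);
  if (!match) {
    return undefined; // no recognizable memory line found
  }
  // Conversion factors from each unit to GiB.
  const toGiB: Record<string, number> = {
    KiB: 1 / (1024 * 1024),
    MiB: 1 / 1024,
    GiB: 1,
    TiB: 1024,
  };
  return Number(match[1]) * toGiB[match[2]];
}
```

Pass it the text printed by docker system info | grep Memory; it returns undefined when no matching line is present.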

Developing/Contributing

If you are thinking about contributing, please be sure to check out the Contributing Guidelines.

Description

The Netflix Data Explorer strives to be a turn-key solution for connecting to Cassandra and Dynomite/Redis datastores. It was developed for internal use at Netflix to codify some of our best practices and help our engineers quickly access their data.

We have provided integration hooks so you can use the Data Explorer in your environment. Since all environments are unique, many configuration overrides can be specified to adapt the app for your particular use case. For instance, you might have clusters with C* authentication enabled, or clusters that are discovered by polling a REST service, etc. We've provided seams so you can integrate accordingly.

Custom Environments

If you want to experiment with connecting to Cassandra or Redis clusters in your own environment (rather than the provided Docker environment), you will need to generate a config file that overrides the defaults.

// example custom config file

// only support C* (no Redis)
export const SUPPORTED_DATASTORE_TYPES = ['cassandra'];

// discovery settings
export const DISCOVERY_PROVIDER = 'FileSystemDiscoveryProvider'; // read a file to get information about available clusters
export const DISCOVERY_PROVIDER_FILESYSTEM_SOURCE = 'discovery.json'; // provides cluster discovery information
export const ENVIRONMENTS = ['test'];
export const REGIONS = ['us-east-1'];

// disable C* authentication
export const CASSANDRA_BASE_AUTH_PROVIDER_USERNAME = undefined;
export const CASSANDRA_BASE_AUTH_PROVIDER_PASSWORD = undefined;

// hypothetical test environment uses a custom port
export const CASSANDRA_PORT = 7199;

// use a custom class to provide C* connection options
export const CASSANDRA_CLIENT_OPTIONS_PROVIDER =
  'CustomCassandraClientOptionsProvider';

Generating a custom config file via the CLI

While a configuration override file can be crafted by hand, we recommend using the provided CLI tool to help you generate it. Please follow all prompts from the tool.

yarn # only required if you haven't run yarn up to this point
yarn setup

Once your config file is generated, the CLI will print the startup commands for you.

Using the custom config file

Once you have a custom config file created, you have a few options on how you can use it:

  • Demo mode
  • Dev mode
  • Production mode

Demo

If you want to use this custom config file with the Data Explorer docker demo image (e.g., running the Data Explorer in Docker, but using a config file that points to C* clusters in your network), you will need to update the .env file in the project root.

# .env file
DATA_EXPLORER_CONFIG_NAME=my-custom-config

Once you've updated the variable, you can re-run the command:

yarn docker:demo

Dev

To run in Dev mode directly from source, you can run yarn dev.

If you have a config file created, it will be used automatically. If you have multiple config files, the yarn dev command will pause and prompt you to choose one, which is handy if you are switching between environments while developing.

# install dependencies
yarn

# run a local dev server for the UI, start a node server in watch mode, and start a local C* and Redis cluster
yarn dev

# optional: tail the local C* and Redis cluster logs so you can see what the server is doing
yarn docker:taillogs

The UI will be available at http://localhost:3000. Please note the port, 3000. A webpack dev server serves the UI, and requests are proxied to the Node app running on port 80.

Production

To run in Production mode, run the following commands. In Production mode, you will need to export the variable DATA_EXPLORER_CONFIG_NAME to use your file.

# install dependencies
yarn

# build for production
yarn build

# start the app using your config file
export DATA_EXPLORER_CONFIG_NAME=my-custom-config && yarn start

The UI will be available at http://localhost.

Re-building the Docker image

The yarn docker:demo command will start C*, Redis, and a dockerized version of the Data Explorer app. If you make changes to the source and want to rebuild the Data Explorer docker image, simply run:

yarn docker:build

Features

Cassandra Features

Here is a sampling of the features available in the Netflix Data Explorer.

Multi-Cluster Access

Multi-cluster access provides easy access to all of the clusters in your environment. The cluster selector in the top nav allows you to switch to any of your discovered clusters quickly.

Explore Your Data

The Explore view provides a simple way to explore your data quickly. You can query by partition and clustering keys, insert and edit records, and easily export the results or download them as CQL statements.
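To illustrate what querying by partition and clustering keys involves, here is a minimal sketch that builds a parameterized CQL SELECT. It is not the Data Explorer's implementation; the `KeyFilter` shape and the `buildSelect` and `quoteIdentifier` names are hypothetical, and the sketch assumes equality filters only.

```typescript
// A key column and the value to match (hypothetical shape, for illustration).
interface KeyFilter {
  column: string;
  value: string | number | boolean;
}

// Quote a CQL identifier: wrap in double quotes and double any embedded quotes.
// Note that quoted identifiers are case-sensitive in CQL.
function quoteIdentifier(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

// Build a parameterized SELECT restricted by the partition key columns and,
// optionally, a leading prefix of the clustering key columns.
function buildSelect(
  keyspace: string,
  table: string,
  partitionKeys: KeyFilter[],
  clusteringKeys: KeyFilter[] = [],
  limit = 100
): { query: string; params: Array<string | number | boolean> } {
  if (partitionKeys.length === 0) {
    throw new Error("At least one partition key filter is required");
  }
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new Error("limit must be a positive integer");
  }
  const filters = partitionKeys.concat(clusteringKeys);
  const where = filters
    .map((f) => quoteIdentifier(f.column) + " = ?")
    .join(" AND ");
  const query =
    "SELECT * FROM " + quoteIdentifier(keyspace) + "." + quoteIdentifier(table) +
    " WHERE " + where + " LIMIT " + limit;
  // Values travel separately from the query text, so they are never interpolated.
  return { query, params: filters.map((f) => f.value) };
}
```

Keeping values in `params` rather than in the query string lets the driver bind them, which avoids quoting mistakes and CQL injection.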

You can also query and decode binary data.

Schema Designer

Creating a new Keyspace and Table by hand can be error-prone. Our schema designer UI streamlines creating a new Table with improved validation and enforcement of best practices.

Query IDE

The Query Mode provides a powerful IDE-like experience for writing free-form CQL queries.

Dynomite and Redis Features

Key Scanning

Browsing through keys in an in-memory database like Redis can cause performance problems if you attempt to use the KEYS command, which walks the entire keyspace in a single blocking call. Instead, it's recommended that you perform a cursor-based SCAN, which returns keys in small batches. The Data Explorer will perform SCAN operations for you in both Redis and Dynomite environments.
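The cursor-based approach can be sketched as a loop that keeps calling SCAN until the returned cursor is "0" again. This is an illustration under stated assumptions, not the Data Explorer's code: `ScanFn`, `scanAllKeys`, and `makeFakeScan` are hypothetical names, and the fake scanner stands in for a real Redis client so the example runs on its own.

```typescript
// Shape of a SCAN-style call: pass a cursor, get back the next cursor and a
// batch of keys. As in Redis, cursor "0" both starts and ends an iteration.
type ScanFn = (cursor: string, count: number) => Promise<[string, string[]]>;

// Follow the cursor until it returns to "0", or until `limit` keys are collected.
async function scanAllKeys(
  scan: ScanFn,
  count = 100,
  limit = Infinity
): Promise<string[]> {
  const seen = new Set<string>(); // SCAN may return the same key more than once
  let cursor = "0";
  do {
    const [next, keys] = await scan(cursor, count);
    for (const key of keys) {
      if (seen.size >= limit) {
        break;
      }
      seen.add(key);
    }
    cursor = next;
  } while (cursor !== "0" && seen.size < limit);
  return Array.from(seen);
}

// Hypothetical in-memory stand-in for a Redis client, for demonstration only.
function makeFakeScan(allKeys: string[]): ScanFn {
  return async (cursor, count) => {
    const start = Number(cursor);
    const end = Math.min(start + count, allKeys.length);
    const next = end >= allKeys.length ? "0" : String(end);
    return [next, allKeys.slice(start, end)];
  };
}
```

Because each call touches only a small batch, the server stays responsive between calls, and the `limit` parameter lets a UI stop early instead of loading every key.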

Key Edits

Redis supports rich data structures such as lists, hashes (maps), and sorted sets. The Data Explorer supports viewing, creating, editing, and deleting these entities.
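As an illustration of what an editor has to handle, the Redis value types can be modeled as a discriminated union. This is a hypothetical sketch; the `RedisValue` type and `describeValue` function are assumptions made for the example and are not taken from the Data Explorer source.

```typescript
// Hypothetical model of the Redis value types a key editor might handle.
type RedisValue =
  | { type: "string"; value: string }
  | { type: "list"; items: string[] }
  | { type: "set"; members: string[] }
  | { type: "hash"; fields: Record<string, string> }
  | { type: "zset"; entries: Array<{ member: string; score: number }> };

// Return a short human-readable summary, e.g. for a key list view.
function describeValue(v: RedisValue): string {
  switch (v.type) {
    case "string":
      return "string (" + v.value.length + " chars)";
    case "list":
      return "list (" + v.items.length + " items)";
    case "set":
      return "set (" + v.members.length + " members)";
    case "hash":
      return "hash (" + Object.keys(v.fields).length + " fields)";
    case "zset":
      return "sorted set (" + v.entries.length + " entries)";
  }
}
```

Switching on the `type` field lets the compiler check that every structure has its own view and edit logic.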
