Language: Java
License: Apache License 2.0

s3mper - Consistent Listing for S3

S3mper

S3mper is a library that provides an additional layer of consistency checking on top of Amazon's S3 index through use of a consistent, secondary index.

Overview

S3mper leverages Aspect Oriented Programming and is implemented with AspectJ to advise implementations of the Hadoop FileSystem (primarily the NativeS3FileSystem implementation) with additional logic to crosscheck a secondary index for consistency.
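
The crosscheck described above can be pictured as a set difference between what the secondary index expects and what the S3 listing returned. The following is a minimal sketch under that assumption, not S3mper's actual code; the class and method names (`ListingCheck`, `missingFromListing`) are hypothetical and no Hadoop or AWS classes are used.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ListingCheck {
    /** Returns the file names the secondary index expects but the S3 listing lacks, sorted. */
    static Set<String> missingFromListing(List<String> s3Listing, List<String> indexEntries) {
        Set<String> missing = new TreeSet<>(indexEntries);
        missing.removeAll(s3Listing);
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical listing: S3 has not yet surfaced part-00001.
        List<String> s3Listing = List.of("part-00000", "part-00002");
        List<String> indexEntries = List.of("part-00000", "part-00001", "part-00002");

        Set<String> missing = missingFromListing(s3Listing, indexEntries);
        // Prints: Consistent: false, missing: [part-00001]
        System.out.println("Consistent: " + missing.isEmpty() + ", missing: " + missing);
    }
}
```

A listing would be treated as consistent only when the returned set is empty; what happens otherwise (recheck, log, or fail) is governed by the configuration options described later.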

The default implementation of the secondary index uses DynamoDB because of the speed, consistency, and availability guarantees that service provides. The table schema is designed to be light-weight and fast so as to not impair the performance of the file system.

There are two logical indexes [1] used in the table structure to access path and file information. The first index is path-based for lookup during listing operations. The second index is timeseries-based so that entries can be expired from both indexes without having to scan the entire table.

Table Structure
| Hash Key: path | Range Key: file | epoch | deleted | dir | linkPath | linkFile |
| --- | --- | --- | --- | --- | --- | --- |
| //<bucket>/<path> | <filename> | <timestamp> | <flag> | <flag> | N/A | N/A |
| epoch [2] | <timestamp+entropy> | N/A | N/A | N/A | //<bucket>/<path> | <filename> |

The purpose of this table schema is to provide range query operations for directory listings and for timeseries deletes.
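
To make the two row shapes concrete, here is a hedged sketch that builds one item for each logical index as a plain map. It only mirrors the column layout described in this section; the class name `IndexEntries`, the `-` separator between timestamp and entropy, and the sample values are assumptions, and the `deleted` and `dir` flags are left out for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IndexEntries {
    /** Path-index item: hash key is the directory path, range key is the file name. */
    static Map<String, String> pathEntry(String bucket, String dir, String file, long timestampMs) {
        Map<String, String> item = new LinkedHashMap<>();
        item.put("path", "//" + bucket + "/" + dir);
        item.put("file", file);
        item.put("epoch", Long.toString(timestampMs));
        return item;
    }

    /** Timeseries item: static hash key 'epoch', range key is the timestamp plus entropy. */
    static Map<String, String> timeseriesEntry(String bucket, String dir, String file,
                                               long timestampMs, String entropy) {
        Map<String, String> item = new LinkedHashMap<>();
        item.put("path", "epoch");                      // static hash key value
        item.put("file", timestampMs + "-" + entropy);  // separator is an assumption
        item.put("linkPath", "//" + bucket + "/" + dir);
        item.put("linkFile", file);
        return item;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println(pathEntry("my-bucket", "logs/2014", "part-00000", now));
        System.out.println(timeseriesEntry("my-bucket", "logs/2014", "part-00000", now, "a1b2"));
    }
}
```

Because every timeseries item shares the hash key `epoch`, a range query on the range key can select all entries older than a given timestamp, and the `linkPath`/`linkFile` values point back to the matching path-index item.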

Building

A Gradle wrapper is used to build s3mper and can be run without additional tools. Edit build.gradle to set the appropriate version of Hadoop, then build with the following command.

$ ./gradlew release

This will produce the necessary jar files for use with hadoop in build/libs and a tar file with all dependencies for use with the admin tool.

Installing

Installation requires the following steps:

  • Installing libraries on client and cluster hosts
  • Modifying the hadoop configuration to enable s3mper

Library Installation

The three jar files from the build/libs directory need to be copied to the $HADOOP_HOME/lib directory on all hosts. The three files are:

s3mper-1.0.0.jar    
aspectjrt-1.7.3.jar           
aspectjweaver-1.7.3.jar

Hadoop Configuration

The following files need to be updated to enable s3mper; the steps marked [Optional] can be skipped:

Changes to $HADOOP_HOME/conf/hadoop-env.sh

This file needs to be updated to load the aspects using a java agent. Modify the HADOOP_OPTS variable like the following:

export HADOOP_OPTS="-javaagent:$HADOOP_HOME/lib/aspectjweaver-1.7.3.jar $HADOOP_OPTS"

Changes to $HADOOP_HOME/conf/core-site.xml

S3mper is disabled by default and must be explicitly enabled with the following option:

<property><name>s3mper.disable</name><value>false</value></property>

Changes to $HADOOP_HOME/conf/mapred-site.xml [Optional]

The child processes of the task trackers need to have the java agent included in the jvm options as well. If you updated the hadoop-env.sh on all hosts, this step should not be necessary and may cause the jvm to fail to start if there are two of the same java agent commands. The following can be added to the mapred-site.xml to add the java agent if the child processes don't have the agent enabled (assumes hadoop is installed in /opt/hadoop):

<property><name>mapred.child.java.opts</name><value>-javaagent:/opt/hadoop/lib/aspectjweaver-1.7.3.jar</value></property>

Detailed Logging $HADOOP_HOME/conf/log4j.properties [Optional]

To turn on detailed s3mper logging to see information about what s3mper is doing, add the following line to the log4j configuration:

log4j.logger.com.netflix.bdp.s3mper=trace

Configuration

Creating the DynamoDB Table

The easiest way to create the initial metastore in DynamoDB is to set s3mper.metastore.create=true and execute a command against the metastore. This will create the table in DynamoDB with the proper configuration. The read/write unit capacity can be configured via the AWS DynamoDB console.

Creating SQS Queues

The SQS Queues used for alerting need to be created by hand (no tool exists to create them automatically). The default queue names are:

s3mper.alert.queue
s3mper.timeout.queue
s3mper.notification.queue

Messages will be delivered to these queues when a listing inconsistency is detected.

Configuration Options

S3mper supports a wide variety of options that can be controlled using properties from within a Pig/Hive/Hadoop job. These options manage how s3mper will respond to an inconsistent listing when it occurs. The table below describes these options:

| Property | Default | Description |
| --- | --- | --- |
| s3mper.disable | FALSE | Disables all functionality of s3mper. The aspect will still be woven, but it will not interfere with normal behavior. |
| s3mper.metastore.create | FALSE | Create the metastore if it doesn't exist. Note that this isn't an atomic operation, so commands may fail until the metastore is fully initialized. It's best to create the metastore prior to running any commands. |
| s3mper.failOnError | FALSE | If true, any inconsistent list operation will result in an S3ConsistencyException being thrown, logging an error, and notifying CloudWatch. If false, the exception will not be thrown but the log and CloudWatch metric will still be sent. |
| s3mper.task.failOnError | FALSE | Controls whether an M/R task (i.e. a child TaskTracker process) will fail if the task fails a consistency check. Most queries only check at the start of the query. |
| s3mper.listing.recheck.count | 15 | How many times to recheck the listing. This works in combination with 's3mper.listing.recheck.period' to control how long to wait before failing/proceeding with the query. |
| s3mper.listing.task.recheck.count | 0 | How many times to recheck the listing within a MapReduce task context (i.e. a child task executing on the EMR cluster). This is handled separately from other cases because it may cause the task to time out. In general, listing is done prior to executing the task. |
| s3mper.listing.recheck.period | 60000 | How long to wait (in milliseconds) between checks defined by 's3mper.listing.recheck.count'. |
| s3mper.listing.task.recheck.period | 0 | How long to wait (in milliseconds) between checks defined by 's3mper.listing.task.recheck.count'. |
| s3mper.metastore.deleteMarker.enabled | FALSE | Use a delete marker instead of removing the entry from the metastore. This fixes the second type of consistency problem, where a file is deleted but the listing still shows it as available, by removing those deleted files from the listing. |
| s3mper.listing.directory.tracking | FALSE | Track directory creation/deletion in the metastore. |
| s3mper.listing.delist.deleted | TRUE | Removes files from the listing that have delete markers applied to them. If delete markers are enabled, this should also be enabled or the listing will expect files that are actually deleted. |
| s3mper.metastore.impl | <see code> | The fully qualified class with the metastore implementation. |
| s3mper.dispatcher.impl | <see code> | The fully qualified class with the alert dispatcher implementation. |
| fs.<scheme>.awsAccessKeyId | | Key to use for DynamoDB access. |
| fs.<scheme>.awsSecretAccessKey | | Secret to use for DynamoDB access. |
| s3mper.override.awsAccessKeyId | | Key to use for DynamoDB access if different from the default key. |
| s3mper.override.awsSecretAccessKey | | Secret to use for DynamoDB access if different from the default key. |
| s3mper.metastore.read.units | 500 | The number of read units to provision on create. Only used if the table does not exist. |
| s3mper.metastore.write.units | 100 | The number of write units to provision on create. Only used if the table does not exist. |
| s3mper.metastore.name | ConsistentListingMetastore | The name of the DynamoDB table to use. |
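
As a worked example of how the recheck count and period combine, the sketch below multiplies the two to estimate the longest time a query could wait on an inconsistent listing. It assumes one wait per recheck, which is a reading of the option descriptions rather than a statement about the implementation; the class name `RecheckBudget` is hypothetical.

```java
import java.time.Duration;

public class RecheckBudget {
    /** Upper bound on time spent waiting between rechecks: count * period. */
    static Duration maxWait(int recheckCount, long recheckPeriodMs) {
        return Duration.ofMillis(Math.multiplyExact((long) recheckCount, recheckPeriodMs));
    }

    public static void main(String[] args) {
        // Defaults for s3mper.listing.recheck.count / s3mper.listing.recheck.period
        System.out.println(maxWait(15, 60_000L).toMinutes() + " minutes"); // 15 minutes
        // Defaults for the task-level variants: no rechecks, so no waiting
        System.out.println(maxWait(0, 0L).toMillis() + " ms");             // 0 ms
    }
}
```

With the defaults, a job could therefore wait up to roughly 15 minutes before failing or proceeding, while task-level checks do not wait at all.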

Verification

A unit test is included with the library that exercises the advised commands on the file system. Build the test jars with the command ./gradlew testJar and include the s3mper-test.jar in the classpath. Use the scripts/verify-consistent-listing.sh command to run the unit tests.

Note: you may need to modify some paths in the script to point to the correct directories.

Administration

S3mper is intended to provide consistency guarantees only for a "window" of time. This means that entries are removed from the secondary index after a set period of time, after which the S3 index is expected to be consistent. A command-line admin tool is provided that allows for configurable cleanup of expired entries in the secondary index. To use the admin tool, unpack the tar file produced in the Building section. The command can be found at the root level and is simply s3mper.

Note: you may need to modify the classpath in the s3mper script to point to the s3mper lib directory from the tar file.

To run the command, use ./s3mper <meta | sqs | fs> <options>

A cron job (or similar scheduled job) should be configured to remove expired entries in DynamoDB. The following command can be used to delete entries older than one day:

./s3mper meta delete_ts -ru 100 -wu 100 -s 1 -d 10 -u days -n 1 

# delete_ts   Use the timeseries index to delete entries
# ru          Max read units to consume in DynamoDB
# wu          Max write units to consume in DynamoDB
# s           Number of scan threads to use (Note: only use 1)
# d           Number of delete threads to use
# u           Time unit
# n           Number of time units          

Running the cleanup on a regular basis (every 30 minutes) keeps the work required by each run small and keeps the consistency time window well regulated.
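
The "older than one day" rule from the cleanup command (`-u days -n 1`) amounts to comparing each entry's timestamp with a cutoff. The sketch below illustrates that arithmetic only; it is an assumption about how the time unit and count are interpreted, not the admin tool's code, and the names `ExpiryCutoff`, `cutoff`, and `isExpired` are hypothetical.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class ExpiryCutoff {
    /** Cutoff instant: entries with timestamps before this are considered expired. */
    static Instant cutoff(Instant now, long amount, ChronoUnit unit) {
        return now.minus(amount, unit);
    }

    /** True if an entry's epoch timestamp (milliseconds) falls before the cutoff. */
    static boolean isExpired(long entryEpochMs, Instant cutoff) {
        return Instant.ofEpochMilli(entryEpochMs).isBefore(cutoff);
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2014-01-02T00:00:00Z");
        Instant oneDayAgo = cutoff(now, 1, ChronoUnit.DAYS);
        System.out.println(oneDayAgo); // 2014-01-01T00:00:00Z

        long oldEntry = Instant.parse("2013-12-31T12:00:00Z").toEpochMilli();
        long freshEntry = Instant.parse("2014-01-01T12:00:00Z").toEpochMilli();
        System.out.println(isExpired(oldEntry, oneDayAgo));   // true
        System.out.println(isExpired(freshEntry, oneDayAgo)); // false
    }
}
```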

Issues

Please file issues on the project's GitHub issue tracker.

Footnotes

  1. The indexes are not implemented using DynamoDB Secondary Indexes due to constraints.

  2. This is a static value 'epoch'.
