  • Stars: 278
  • Rank: 148,454 (Top 3%)
  • Language: JavaScript
  • License: Apache License 2.0
  • Created almost 11 years ago
  • Updated over 7 years ago

sepia - the way things used to be

Sepia is a VCR-like module for node.js that records HTTP interactions, then plays them back exactly like the first time they were invoked. Sepia was created to isolate a server from its remote downstream dependencies, for speed and fault tolerance.

Sepia should work with any HTTP library in node.js that uses http#request and https#request. In practice, it has been extensively tested against the request module, and there is a test to ensure it works with the then-request module.

Sepia was developed at LinkedIn and has been in use there since early 2013. It is used to improve the speed and reliability of the integration test suite for the node.js server powering the mobile applications.

https://github.com/linkedin/sepia
https://npmjs.org/package/sepia

Quick Start

Install the module.

npm install sepia

Plop it into your application:

require('sepia');

Now, when you start your application, run it with the VCR_MODE environment variable set to the correct value:

npm start                   # no sepia
VCR_MODE=record npm start   # sepia, in record mode
VCR_MODE=playback npm start # sepia, in playback mode
VCR_MODE=cache npm start    # sepia, in cache mode

Running the examples

cd sepia # wherever you installed the module
npm install
time VCR_MODE=record   node examples/http
time VCR_MODE=playback node examples/http # notice it's much faster!

The example is located in examples/http.js. It exercises the core functionality of the module.

cd sepia
npm install
rm -r fixtures # in case you had previously generated fixtures
VCR_MODE=cache node examples/cache

This example demonstrates the cache mode, which makes a real HTTP request and records it if the fixture does not exist, but then reuses the fixture if it does exist. Notice that the first call takes about one second, whereas the second call finishes quickly.

To run all the examples in the correct modes, run:

npm test

Motivation

Sepia was created for the following use case:

  • Integration tests are being run against a node.js server under test.
  • The server under test makes HTTP requests to external downstream services.
  • The integration tests are driven by a client running in a separate process from the server.

Even though the server is the system being tested, the stability of the integration tests depends on the reliability of the downstream services. Additionally, making HTTP calls to live downstream services makes the integration tests very slow. To combat this, sepia hooks into the node.js http and https modules inside the server process, intercepting outgoing HTTP(S) requests. Sepia records these requests, then plays them back the next time the requests are made.

VCR Modes

The value of the VCR_MODE environment variable determines how sepia behaves. Acceptable values are:

  • record: Make the downstream request, then save it to a fixture file.
  • playback: Don't make the downstream request. Attempt to retrieve the data from the corresponding fixture file, and throw an error if the file does not exist.
  • cache: First try to locate the fixture and play it back. If the fixture file does not exist, make the downstream request and save it to the file.
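
The three modes reduce to a small decision table. The following is a minimal sketch of that table, not sepia's source: `decideAction` and its return values are hypothetical names, and treating an unset VCR_MODE as "do nothing" is an assumption based on the Quick Start comment "no sepia".

```javascript
// Hypothetical helper (not part of sepia's API): maps a VCR_MODE value and
// fixture availability to the behavior described in the list above.
function decideAction(mode, fixtureExists) {
  switch (mode) {
    case 'record':
      // Always hit the downstream service and save the response.
      return 'request-and-save';
    case 'playback':
      // Never hit the downstream service; a missing fixture is an error.
      if (!fixtureExists) {
        throw new Error('fixture not found');
      }
      return 'replay';
    case 'cache':
      // Replay when possible, otherwise record.
      return fixtureExists ? 'replay' : 'request-and-save';
    default:
      // Assumption: with VCR_MODE unset, requests pass through untouched.
      return 'passthrough';
  }
}

console.log(decideAction('cache', false));
console.log(decideAction('cache', true));
```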

Fixture Filenames

Fixture data generated during the recording phase are stored in files. In order to uniquely associate each HTTP request with a filename used to store the fixture data, several characteristics of the request are examined:

  • The HTTP method, e.g. GET or POST.
  • The request URL.
  • The request body.
  • The names of all the request headers.
  • The names of all the cookies sent in the request.

This data is then aggregated and sent through an MD5 hash to produce the filename. Users of sepia can hook into this process of constructing the filename, as explained in subsequent sections.
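
To make the idea concrete, here is a rough sketch of hashing those request characteristics into a filename. It is only an illustration: `fixtureFilename`, the field order and the newline separator are assumptions, and sepia's real aggregation format may differ.

```javascript
var crypto = require('crypto');

// Hypothetical sketch: derive a fixture filename from a request description.
// The field order and separator are assumptions, not sepia's actual format.
function fixtureFilename(req) {
  // Only the names of headers and cookies are used, never their values.
  var headerNames = Object.keys(req.headers || {})
    .map(function(name) { return name.toLowerCase(); })
    .sort();
  var cookieNames = Object.keys(req.cookies || {})
    .map(function(name) { return name.toLowerCase(); })
    .sort();

  var parts = [
    req.method,
    req.url,
    req.body || '',
    headerNames.join(','),
    cookieNames.join(',')
  ];

  return crypto.createHash('md5').update(parts.join('\n')).digest('hex');
}

var a = fixtureFilename({
  method: 'GET',
  url: 'http://example.com/my-resource',
  headers: { Accept: 'text/html' },
  cookies: { session: 'abc' }
});
var b = fixtureFilename({
  method: 'GET',
  url: 'http://example.com/my-resource',
  headers: { accept: 'application/json' },
  cookies: { session: 'xyz' }
});

// Same method, URL, body, header names and cookie names: same filename.
console.log(a === b);
```

Because header and cookie values are left out of the hash in this sketch, two requests that differ only in those values map to the same fixture.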

This core functionality is exercised in examples/http.js and examples/request.js:

time VCR_MODE=record   node examples/http
time VCR_MODE=playback node examples/http

time VCR_MODE=record   node examples/request
time VCR_MODE=playback node examples/request

Fixture Data

By default, the files are stored in fixtures/generated under the directory in which the application was started. To override this:

var path = require('path');
var sepia = require('sepia');
sepia.fixtureDir(path.join(process.cwd(), 'sepia-fixtures'));

If this directory doesn't exist, it will be created.

This functionality is exercised in examples/fixtureDir:

VCR_MODE=record   node examples/fixtureDir
VCR_MODE=playback node examples/fixtureDir

Configure

Sepia can be optionally configured using a call to sepia.configure(). All options have default values, so they need not be configured unless necessary.

var sepia = require('sepia');
sepia.configure({
  verbose: true,
  debug: true
});

The full list of options is as follows:

  • verbose: outputs extra data whenever a fixture is accessed, along with the parts used to create the name of the fixture.

  • includeHeaderNames, headerWhitelist, includeCookieNames, cookieWhitelist: detailed in a later section.

  • debug: useful for debugging requests when there is a cache miss. If fixtures are recorded with debug set to true, sepia additionally saves all incoming requests as '.request' files. Furthermore, on a cache miss during playback, it compares the missing request ('.missing') against all the available saved requests ('.request') to find the best match, by computing the string distance to each. The output is the most similar request fixture, that is, the one with the smallest string distance. Based on this output, URL and body filters can be added, as explained in the next section.

    For performance, and to keep the search space small, it is recommended to save fixtures in separate folders per test or test suite. The debug feature is still under development and we will continue to refine it in upcoming releases.
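
The best-match search performed in debug mode can be pictured with the sketch below. The text only says "string distance", so the use of Levenshtein edit distance here is an assumption, and `editDistance` and `closestRequest` are hypothetical names rather than sepia functions.

```javascript
// Levenshtein edit distance between two strings. This metric is an
// assumption; the description above does not name the one sepia uses.
function editDistance(a, b) {
  var prev = [];
  for (var j = 0; j <= b.length; j++) {
    prev[j] = j;
  }
  for (var i = 1; i <= a.length; i++) {
    var curr = [i];
    for (var k = 1; k <= b.length; k++) {
      var cost = a[i - 1] === b[k - 1] ? 0 : 1;
      curr[k] = Math.min(
        prev[k] + 1,        // deletion
        curr[k - 1] + 1,    // insertion
        prev[k - 1] + cost  // substitution
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns the saved request closest to the missing one, or null if there
// are no saved requests.
function closestRequest(missing, saved) {
  var best = null;
  saved.forEach(function(candidate) {
    var distance = editDistance(missing, candidate);
    if (best === null || distance < best.distance) {
      best = { request: candidate, distance: distance };
    }
  });
  return best;
}

var match = closestRequest('GET /items?time=200', [
  'GET /users',
  'GET /items?time=100'
]);
console.log(match.request, match.distance);
```

A small distance that comes entirely from one query parameter, as in this example, is a hint that a URL filter for that parameter would turn the miss into a hit.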

URL and Body Filtering

Both the URL and the request body, if present, are used to generate the filename for fixtures. The latter is used to differentiate between two POST or PUT requests pointing to the same URL but differing only in the request body.

Sometimes, a request contains data in the URL or the body that is necessary for the successful execution of that request, but that changes between repeated invocations of that resource. One typical example is a timestamp; another is a uniquely generated request ID. Two requests that are identical apart from such parameters should often be considered the same for recording and playback purposes.

To this end, a URL and body filtering functionality is provided. Suppose that your tests make the following request:

request('http://example.com/my-resource?time=' + Date.now(), next);

and while the time query parameter is required for the request to complete, you want to play back the same data that was recorded, regardless of what timestamp was used during recording and during playback. Use a URL filter:

var sepia = require('sepia');
sepia.filter({
  url: /my-resource/,
  urlFilter: function(url) {
    return url.replace(/time=[0-9]+/, '');
  }
});

The url field is used to determine which requests should have urlFilter applied to them. The matcher is a regex. The filter is only applied to determine which fixture will be used; the actual request made to the remote resource during recording is unchanged.

The filter specification can also contain a bodyFilter function that operates on the request body. Either urlFilter or bodyFilter may be specified.

Multiple calls to sepia#filter may be made. All matching filters are applied in the order they are specified. The url property of the filter is used to match the unmodified URL, regardless of the transformations it undergoes due to matching urlFilter functions.
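
The ordering and matching rules above can be sketched as follows. This is an illustration of the described semantics only: `addFilter` and `applyFilters` are hypothetical stand-ins and do not call sepia, and the `requestId` body field is made up for the example.

```javascript
// Hypothetical filter registry mirroring the shape passed to sepia.filter().
var filters = [];

function addFilter(spec) {
  filters.push(spec);
}

// Returns the URL and body that would be used to look up the fixture.
function applyFilters(url, body) {
  var filteredUrl = url;
  var filteredBody = body;
  filters.forEach(function(filter) {
    // The matcher always sees the original, unmodified URL.
    if (!filter.url.test(url)) {
      return;
    }
    if (filter.urlFilter) {
      filteredUrl = filter.urlFilter(filteredUrl);
    }
    if (filter.bodyFilter) {
      filteredBody = filter.bodyFilter(filteredBody);
    }
  });
  return { url: filteredUrl, body: filteredBody };
}

addFilter({
  url: /my-resource/,
  urlFilter: function(url) {
    return url.replace(/time=[0-9]+/, '');
  }
});
// Still matches after the first filter has stripped "time=", because the
// matcher is tested against the unmodified URL.
addFilter({
  url: /time=/,
  bodyFilter: function(body) {
    return body.replace(/"requestId":"[^"]*"/, '"requestId":""');
  }
});

var first = applyFilters('http://example.com/my-resource?time=1', '{"requestId":"a1"}');
var second = applyFilters('http://example.com/my-resource?time=2', '{"requestId":"b2"}');
console.log(first.url === second.url && first.body === second.body);
```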

An example of this functionality can be found in examples/filters:

VCR_MODE=record   node examples/filters
VCR_MODE=playback node examples/filters

Headers and Cookies

HTTP headers and cookies are often relevant to the way requests are served, but their exact values are often highly variable. For example, the presence of certain cookies may affect the authentication mechanism used behind the scenes, and while one may wish to exercise both mechanisms, it is not useful to require that the actual authentication cookie have a particular value.

Sepia generates filenames based on the presence and absence of header and cookie names. In particular, all the header names are lower-cased and sorted alphabetically, and this list is used to construct the fixture filename corresponding to a request. The same applies to the cookie names.

If this feature is not desired, it can be disabled by calling sepia.configure():

var sepia = require('sepia');
sepia.configure({
  includeHeaderNames: false,
  includeCookieNames: false
});

Additionally, a whitelist can be specified for the headers or for the cookies. If the whitelist is empty, as is the default, all header names and cookie names will be used to construct the fixture filename. If either whitelist has any strings in it, only the corresponding headers or cookies will be used to construct the filename. Either whitelist can be specified in isolation or both may be specified:

var sepia = require('sepia');
sepia.configure({
  headerWhitelist: ['upgrade', 'via', 'x-custom'],
  cookieWhitelist: ['oldAuth', 'newAuth']
});

Note that capitalization does not matter.
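
As a rough illustration of the rules above (lower-casing, sorting, and an optional whitelist), consider the following sketch. `namesForFilename` is a hypothetical helper, not a sepia function.

```javascript
// Hypothetical helper: returns the header or cookie names that would feed
// the fixture filename, given an optional whitelist.
function namesForFilename(names, whitelist) {
  var allowed = (whitelist || []).map(function(name) {
    return name.toLowerCase();
  });
  return names
    .map(function(name) { return name.toLowerCase(); })
    .filter(function(name) {
      // An empty whitelist means every name is used.
      return allowed.length === 0 || allowed.indexOf(name) !== -1;
    })
    .sort();
}

// No whitelist: all names, lower-cased and sorted.
console.log(namesForFilename(['X-Custom', 'Accept', 'Via'], []));
// With a whitelist: only matching names survive.
console.log(namesForFilename(['X-Custom', 'Accept', 'Via'], ['upgrade', 'via', 'x-custom']));
```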

Examples of this functionality can be seen in examples/headers.js:

rm -r fixtures # in case you had previously generated fixtures
VCR_MODE=cache node examples/headers

Languages

A downstream request may return different data based on the language requested by the server under test. To support this use case, sepia automatically isolates fixtures based on the value of the Accept-Language request header.

The first language in the list of languages specified by this header is used as the directory name into which the fixtures will be placed for that request. This directory is placed under the configured fixture directory. If no languages are specified, either due to an empty value or due to the header not being present in the first place, the fixtures will be placed directly into the configured fixture directory.
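
The directory selection described above might look roughly like this. `languageFixtureDir` is a hypothetical helper, and stripping a `;q=` weight from the first entry is an assumption about parsing that the text does not spell out.

```javascript
var path = require('path');

// Hypothetical helper: picks the fixture directory for a request based on
// its Accept-Language header value.
function languageFixtureDir(fixtureDir, acceptLanguage) {
  // Take the first comma-separated entry and drop any ";q=" weight
  // (the weight handling is an assumption).
  var first = (acceptLanguage || '').split(',')[0].split(';')[0].trim();
  // No language: fixtures go directly into the configured directory.
  return first ? path.join(fixtureDir, first) : fixtureDir;
}

console.log(languageFixtureDir('fixtures/generated', 'en-US,en;q=0.8'));
console.log(languageFixtureDir('fixtures/generated', undefined));
```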

Examples of this functionality can be seen in examples/languages.js:

rm -r fixtures # in case you had previously generated fixtures
VCR_MODE=record   node examples/languages
VCR_MODE=playback node examples/languages

VCR Cassettes

A series of downstream requests can be isolated, and their fixtures stored in a separate directory, using sepia.fixtureDir(). However, this requires that the grouping happens in the same process as the one running sepia. In the motivating example given at the beginning of this document, the integration test driver runs in a completely different process than the server managed by sepia.

To help manage the sepia instance in a separate process, sepia itself can start up an embedded HTTP server in the process where it replaces the HTTP request functions. The test process can then communicate with this HTTP server and set options, namely the directory into which fixtures will go. This architecture is visualized as follows:

[Figure: architecture diagram]

This can be enabled by asking to start up the embedded server:

var sepia = require('sepia').withSepiaServer();

Note that because this causes a new server to be started, the process that includes sepia should shut down the server as follows:

sepia.shutdown();

This can be used to emulate "cassette"-like functionality:

// suppose the server process that is running sepia is bound to port 8080
// in the test process
var request = require('request');

request.post({
  url: 'http://localhost:58080/testOptions', // sepia's embedded server
  json: {
    testName: 'test1'
  }
}, function(err, res, body) {
  // now, all requests made by localhost:8080 will have their fixtures
  // isolated into a directory named 'test1'
  request.get({
    url: 'http://localhost:8080/makeDownstreamRequests'
  });
});

Note that the functionality of setting the test options will be available in a sepia client library in the future.

Currently, the port of the embedded server is hard-coded to be 58080, but this will be configurable in the future. Furthermore, only the "test name" can be set, but more options may become available.

An example of this functionality can be seen in examples/testName.js:

rm -r fixtures # in case you had previously generated fixtures
VCR_MODE=cache node examples/testName

Bypassing the Cassette

When isolating a group of fixtures into a separate directory, it is sometimes useful to specify a single fixture as "global", that is, living outside the test-specific directory and shared by multiple tests. To achieve this, a filter can be added:

var sepia = require('sepia');
sepia.filter({
  url: /my-global-resource/,
  global: true
});

Now, all requests whose URLs match /my-global-resource/ will be placed in the root of the configured fixtureDir, regardless of what the current test name is.
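
The effect of a global filter on fixture placement can be sketched as below. `fixtureDirFor` is a hypothetical helper, and placing the test-name directory directly under the configured fixture directory is an assumption about the layout.

```javascript
var path = require('path');

// Hypothetical helper: chooses the directory for a fixture given the current
// test name and the URL patterns registered with `global: true`.
function fixtureDirFor(fixtureDir, testName, url, globalPatterns) {
  var isGlobal = globalPatterns.some(function(pattern) {
    return pattern.test(url);
  });
  // Global fixtures, and requests made with no test name set, live in the root.
  if (isGlobal || !testName) {
    return fixtureDir;
  }
  return path.join(fixtureDir, testName);
}

var globals = [/my-global-resource/];
console.log(fixtureDirFor('fixtures', 'test1', 'http://example.com/my-global-resource', globals));
console.log(fixtureDirFor('fixtures', 'test1', 'http://example.com/other', globals));
```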

Cassettes Without Modifying Global State

The above approach to VCR cassettes modifies global state in the server managed by sepia. This prevents running multiple tests--with different test names--in parallel, because the nature of the global state is such that only one test name can be set at one time. If you're willing to pass along information from an incoming request down to a downstream request, sepia provides a stateless alternative: the x-sepia-test-name header.

The x-sepia-test-name header, when passed to a downstream request, will override the globally-configured test name. The header itself is not passed to any downstream service, nor is the header name used in the calculation of the fixture name.

The downside is that the server under test must pass along information from the integration test runner to each of its downstream requests, because otherwise sepia has no means of determining the associated test name for a particular downstream request.
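
One way the server under test could pass the test name along is sketched below. `downstreamHeaders` is a hypothetical helper, and it assumes the test runner sends the same x-sepia-test-name header on the incoming request; any other channel for carrying the name would work equally well.

```javascript
// Hypothetical helper for the server under test: builds the headers for a
// downstream request, forwarding the test name from the incoming request.
// Node's http module exposes incoming header names in lower case.
function downstreamHeaders(incomingHeaders, extraHeaders) {
  var headers = {};
  Object.keys(extraHeaders || {}).forEach(function(name) {
    headers[name] = extraHeaders[name];
  });
  var testName = incomingHeaders['x-sepia-test-name'];
  if (testName) {
    headers['x-sepia-test-name'] = testName;
  }
  return headers;
}

var headers = downstreamHeaders(
  { host: 'localhost:8080', 'x-sepia-test-name': 'test1' },
  { accept: 'application/json' }
);
console.log(headers);
```

In a request handler, the first argument would typically be `req.headers` from the incoming request, and the result would be passed as the `headers` option of the outgoing call.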

Limitations

Repeated Identical HTTP Requests

If the same request returns different data during different invocations, sepia has no way of differentiating between the two invocations. This can happen when, for example, a resource is fetched using a GET request, it is modified using a PUT request, and it is fetched once more using a GET request to verify that it was updated successfully.

While you can use the test name functionality described above, it may not be semantically valid to spread fixtures for the same test under multiple directories. One way around this currently is to actually make the requests different in some way.

For example, in an integration test scenario, you may be able to pass a unique identifier (e.g. testUpdate1 and testUpdate2) along with each request made from the test. Typically, this would be passed as a query parameter that would be passed along by the server under test to any downstream services, which would then ignore this parameter.

Technical Details

Sepia wraps around the http#request and the https#request functions. Each outgoing request is trapped. Depending on the value of the VCR_MODE environment variable, the request is either made and stored in a file, or the data is retrieved from a file and sent back using a dummy response object.
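
The wrapping technique can be illustrated with a heavily simplified sketch. This is not sepia's implementation: `install`, `uninstall`, the in-memory `fixtures` map and the lookup key are all hypothetical, and a real dummy response would need far more of the response interface than is shown here.

```javascript
var http = require('http');
var EventEmitter = require('events').EventEmitter;

var originalRequest = http.request;

// Replaces http.request with a wrapper that replays canned bodies for known
// keys and delegates everything else to the real implementation.
function install(fixtures) {
  http.request = function(options, callback) {
    var key = (options.method || 'GET') + ' ' + options.host + options.path;
    if (!Object.prototype.hasOwnProperty.call(fixtures, key)) {
      return originalRequest.apply(http, arguments);
    }
    // Dummy request object: swallows writes and answers on end().
    var req = new EventEmitter();
    req.write = function() {};
    req.end = function() {
      // Dummy response object carrying the recorded body.
      var res = new EventEmitter();
      res.statusCode = 200;
      if (callback) {
        callback(res);
      }
      res.emit('data', fixtures[key]);
      res.emit('end');
    };
    return req;
  };
}

function uninstall() {
  http.request = originalRequest;
}

install({ 'GET example.com/hello': 'recorded body' });

var received = '';
var req = http.request({ host: 'example.com', path: '/hello' }, function(res) {
  res.on('data', function(chunk) {
    received += chunk;
  });
});
req.end();
uninstall();

console.log(received);
```

No network traffic occurs for the matched request in this sketch; the caller's callback receives the dummy response exactly as it would receive a real one.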
