  • Language
    Java
  • License
    Apache License 2.0
  • Created about 11 years ago
  • Updated about 2 months ago

Jolt

JSON to JSON transformation library written in Java where the "specification" for the transform is itself a JSON document.

Useful For

  1. Transforming JSON data from ElasticSearch, MongoDB, Cassandra, etc. before sending it off to the world
  2. Extracting data from large JSON documents for your own consumption

Table of Contents

  1. Overview
  2. Documentation
  3. Shiftr Transform DSL
  4. Demo
  5. Getting Started
  6. Getting Transform Help
  7. Why Jolt Exists
  8. Alternatives
  9. Performance
  10. CLI
  11. Code Coverage
  12. Release Notes

Overview

Jolt :

  • provides a set of transforms that can be "chained" together to form the overall JSON to JSON transform.
  • focuses on transforming the structure of your JSON data, not manipulating specific values
    • The idea being: use Jolt to get most of the structure right, then write code to fix values
  • consumes and produces "hydrated" JSON : in-memory tree of Maps, Lists, Strings, etc.
    • use Jackson (or whatever) to serialize and deserialize the JSON text
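To make "hydrated" JSON concrete, here is a minimal sketch that uses only the Java standard library. The class name HydratedJsonExample and the sample data are made up for illustration; in practice a parser such as Jackson would build this tree from JSON text.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example class; not part of Jolt.
final class HydratedJsonExample {

    // Builds the in-memory equivalent of:
    // { "rating": { "value": 3, "tags": [ "a", "b" ] } }
    static Map<String, Object> sampleInput() {
        List<Object> tags = new ArrayList<>();
        tags.add("a");
        tags.add("b");

        Map<String, Object> rating = new LinkedHashMap<>();
        rating.put("value", 3);
        rating.put("tags", tags);

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("rating", rating);
        return root;
    }

    public static void main(String[] args) {
        // Walking the tree is plain Map and List access with casts.
        Map<?, ?> rating = (Map<?, ?>) sampleInput().get("rating");
        List<?> tags = (List<?>) rating.get("tags");
        System.out.println(rating.get("value") + " " + tags);
    }
}
```

JSON objects become Maps, arrays become Lists, and scalars become Strings, Numbers or Booleans, which is the kind of tree a transform consumes and produces.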

Stock Transforms

The Stock transforms are:

shift       : copy data from the input tree and put it in the output tree
default     : apply default values to the tree
remove      : remove data from the tree
sort        : sort the Map key values alphabetically ( for debugging and human readability )
cardinality : "fix" the cardinality of input data.  E.g., the "urls" element is usually a List, but if there is only one, then it is a String

Each transform has its own DSL (Domain Specific Language) in order to facilitate its narrow job.

Currently, all the Stock transforms only affect the "structure" of the data. To do data manipulation, you will need to write Java code. If you write your Java "data manipulation" code to implement the Transform interface, then you can insert your code in the transform chain.

The out-of-the-box Jolt transforms should be able to do most of your structural transformation, with custom Java Transforms implementing your data manipulation.
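As a rough illustration of a custom data-manipulation step, the sketch below declares a local stand-in for the Transform interface (assumed here to be a single method taking and returning hydrated JSON) and a tiny chain runner. The names CustomTransformSketch, UpperCaseName and runChain are hypothetical, and the chain runner only mimics the idea of chaining; it is not how Chainr is implemented.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; uses a local stand-in instead of the real Jolt classes.
final class CustomTransformSketch {

    // Stand-in for Jolt's Transform interface (assumed shape: hydrated JSON in, hydrated JSON out).
    interface Transform {
        Object transform(Object input);
    }

    // Value-manipulating step: upper-cases the top-level "name" value if it is a String.
    static final class UpperCaseName implements Transform {
        @Override
        public Object transform(Object input) {
            if (!(input instanceof Map)) {
                return input; // nothing to do for non-object input
            }
            // Copy the top level so the caller's map is left untouched.
            Map<String, Object> copy = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) input).entrySet()) {
                copy.put(String.valueOf(e.getKey()), e.getValue());
            }
            Object name = copy.get("name");
            if (name instanceof String) {
                copy.put("name", ((String) name).toUpperCase(Locale.ROOT));
            }
            return copy;
        }
    }

    // Feeds the output of each step into the next one.
    static Object runChain(List<Transform> chain, Object input) {
        Object current = input;
        for (Transform step : chain) {
            current = step.transform(current);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("name", "jolt");
        input.put("stars", 5);
        List<Transform> chain = Arrays.asList(new UpperCaseName());
        System.out.println(runChain(chain, input));
    }
}
```

With the real library, a class like UpperCaseName would implement Jolt's own Transform interface and be referenced from the chain spec by its fully qualified class name.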

Documentation

Jolt Slide Deck : covers motivation, development, and transforms.

Javadoc explaining each transform DSL :

  • shift
  • default
  • remove
  • cardinality
  • sort
  • fully qualified Java ClassName : the Class implements the Transform or ContextualTransform interface, and can optionally be SpecDriven (marker interface)
    • Transform interface
    • SpecDriven
      • where the "input" is the "hydrated" Java version of your JSON data

Running a Jolt transform means creating an instance of Chainr with a list of transforms.

For an example of the JSON spec for Chainr, see this unit test.

The Java side looks like :

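// Requires com.bazaarvoice.jolt.Chainr, com.bazaarvoice.jolt.JsonUtils and java.util.List
List<Object> chainrSpec = JsonUtils.classpathToList( "/path/to/chainr/spec.json" );
Chainr chainr = Chainr.fromSpec( chainrSpec ); // build the chain once from the loaded spec

Object input = elasticSearchHit.getSource(); // ElasticSearch already returns hydrated JSON

Object output = chainr.transform( input );

return output;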

Shiftr Transform DSL

The Shiftr transform generally does most of the "heavy lifting" in the transform chain. To see the Shiftr DSL in action, please look at our unit tests (shiftr tests) for nice bite-sized transform examples, and read the extensive Shiftr javadoc.

Our unit tests follow the pattern :

{
    "input": {
        // sample input
    },

    "spec": {
        // transform spec
    },

    "expected": {
        // what the output of the transform looks like
    }
}

We read in "input", apply the "spec", and Diffy it against the "expected".

To learn the Shiftr DSL, examine the "input" and "expected" (output) JSON, get an understanding of how data is moving, and then look at the transform spec to see how it facilitates the transform.

For reference, this was the very first test we wrote.

Demo

There is a demo available at jolt-demo.appspot.com. You can paste in JSON input data and a Spec, and it will post the data to the server and run the transform.

Note

  • it is hosted on a free Google App Engine instance, so it may take a minute to spin up.
  • it validates the input JSON and spec client side.

Getting Started

Getting started code wise has its own doc.

Getting Transform Help

If you can't get a transform working and you need help, create an Issue in Jolt (for now).

Make sure you include what your "input" is, and what you want your "output" to be.

Why Jolt Exists

Aside from writing your own custom code to do a transform, there are two general approaches to doing JSON to JSON transforms in Java.

  1. JSON -> XML -> XSLT or STX -> XML -> JSON

Aside from being a Rube Goldberg approach, XSLT is more complicated than Jolt because it is trying to do the whole transform with a single DSL.

  2. Write a Template (Velocity, FreeMarker, etc) that takes hydrated JSON input and writes textual JSON output

With this approach you are working from the output format backwards to the input, which is complex for any non-trivial transform. E.g., the structure of your template will be dictated by the output JSON format, and you will end up coding a parallel tree walk of the input data and the output format in your template. Jolt works forward from the input data to the output format, which is simpler, and it does the parallel tree walk for you.
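To give a feel for what "working forward from the input" with a parallel tree walk means, here is a deliberately tiny toy in plain Java. It is loosely modeled on the shift idea (the spec mirrors the input, and each String leaf names a dot-separated output path), but it is not Jolt's Shiftr: the class ToyShift and its methods are hypothetical, and wildcards, arrays and every other real DSL feature are left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy; a greatly simplified stand-in for a shift-style transform.
final class ToyShift {

    // Walks spec and input in parallel and writes matched values into a new output tree.
    static Map<String, Object> shift(Map<String, Object> spec, Map<String, Object> input) {
        Map<String, Object> output = new LinkedHashMap<>();
        walk(spec, input, output);
        return output;
    }

    @SuppressWarnings("unchecked")
    private static void walk(Map<String, Object> spec, Map<String, Object> input,
                             Map<String, Object> output) {
        for (Map.Entry<String, Object> entry : spec.entrySet()) {
            if (!input.containsKey(entry.getKey())) {
                continue; // spec key has no match in the input; skip it
            }
            Object specValue = entry.getValue();
            Object inputValue = input.get(entry.getKey());
            if (specValue instanceof String) {
                // Leaf: the spec value is the output path for this input value.
                put(output, ((String) specValue).split("\\."), inputValue);
            } else if (specValue instanceof Map && inputValue instanceof Map) {
                // Both sides are objects: descend one level in both trees.
                walk((Map<String, Object>) specValue, (Map<String, Object>) inputValue, output);
            }
        }
    }

    // Creates intermediate maps along the path, then stores the value at the last segment.
    @SuppressWarnings("unchecked")
    private static void put(Map<String, Object> output, String[] path, Object value) {
        Map<String, Object> current = output;
        for (int i = 0; i < path.length - 1; i++) {
            Object next = current.get(path[i]);
            if (!(next instanceof Map)) {
                next = new LinkedHashMap<String, Object>();
                current.put(path[i], next);
            }
            current = (Map<String, Object>) next;
        }
        current.put(path[path.length - 1], value);
    }

    public static void main(String[] args) {
        Map<String, Object> rating = new LinkedHashMap<>();
        rating.put("value", 3);
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("rating", rating);
        input.put("ignored", true);

        Map<String, Object> specRating = new LinkedHashMap<>();
        specRating.put("value", "Review.Rating");
        Map<String, Object> spec = new LinkedHashMap<>();
        spec.put("rating", specRating);

        System.out.println(shift(spec, input)); // prints {Review={Rating=3}}
    }
}
```

The spec author only states where each input value should land; the walk over both trees is handled by the transform, and input keys the spec does not mention (such as "ignored") are simply not copied.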

Alternatives

Being in the Java JSON processing "space", here are some other interesting JSON manipulation tools to look at / consider :

  • jq - Awesome command line tool to extract data from JSON files (use it all the time, available via brew)
  • JsonPath - Java : Extract data from JSON using XPath-like syntax.
  • JsonSurfer - Java : Streaming JsonPath processor dedicated to processing big and complicated JSON data.

Performance

The primary goal of Jolt was to improve "developer speed" by providing the ability to write declarative rather than imperative transforms. That said, Jolt should have a better runtime than the alternatives listed above.

Work has been done to make the stock Jolt transforms fast:

  1. Transforms can be initialized once with their spec, and re-used many times in a multi-threaded environment.
    • We reuse initialized Jolt transforms to service multiple web requests from a DropWizard service.
  2. "*" wildcard logic was redone to reduce the use of Regex in the common case, which was a dramatic speed improvement.
  3. The parallel tree walk performed by Shiftr was optimized.
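The "initialize once, reuse many times" pattern from the first point can be sketched in plain Java as follows. This is only an illustration under assumptions: SHARED_TRANSFORM is a trivial immutable stand-in rather than a Jolt transform, and ReuseSketch, buildTransform and handleRequests are made-up names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.UnaryOperator;

// Hypothetical sketch; the shared "transform" is a stand-in, not a Jolt class.
final class ReuseSketch {

    // Built once at class load time and never mutated, so it can be shared between threads.
    static final UnaryOperator<Object> SHARED_TRANSFORM = buildTransform("out:");

    static UnaryOperator<Object> buildTransform(String prefix) {
        // Imagine the (comparatively expensive) spec parsing happening here, once.
        return input -> prefix + input;
    }

    // Runs every input through the single shared transform on a small thread pool.
    static List<Object> handleRequests(List<Object> inputs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Object>> futures = new ArrayList<>();
            for (Object input : inputs) {
                Callable<Object> task = () -> SHARED_TRANSFORM.apply(input);
                futures.add(pool.submit(task));
            }
            // Collect results in submission order.
            List<Object> results = new ArrayList<>();
            for (Future<Object> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Object> inputs = new ArrayList<>();
        inputs.add("a");
        inputs.add("b");
        inputs.add("c");
        System.out.println(handleRequests(inputs, 2));
    }
}
```

The point of the design is that the setup cost is paid once, while each request only pays for the transform call itself.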

Two things to be aware of :

  1. Jolt is not "stream" based, so if you have a very large JSON document to transform you need to have enough memory to hold it.
  2. The transform process will create and discard a lot of objects, so the garbage collector will have work to do.

Jolt CLI

Jolt Transforms and tools can be run from the command line. Command line interface doc here.

Code Coverage

For the moment we have Cobertura configured in our poms.

mvn cobertura:cobertura
open jolt-core/target/site/cobertura/index.html

Currently, for the jolt-core artifact, code coverage is at 89% line, and 83% branch.

Release Notes

Versions and Release Notes available here.
