XSLTJSON: Transforming XML to JSON using XSLT

XSLTJSON is an XSLT 2.0 stylesheet to transform arbitrary XML to JavaScript Object Notation (JSON). JSON is a lightweight data-interchange format based on a subset of the JavaScript language, and often offered as an alternative to XML in, for example, web services. To make life easier XSLTJSON allows you to transform XML to JSON automatically.

XSLTJSON supports several different JSON output formats, from a compact output format to support for the BadgerFish convention, which allows round-trips between XML and JSON. To make things even better, it is completely free and open-source. If you do not have an XSLT 2.0 processor, you can use XSLTJSON Lite, which is an XSLT 1.0 stylesheet that transforms XML to the JSONML format.

Usage

There are three ways to use XSLTJSON: you can call the stylesheet from the command line, call it programmatically, or import it into your own stylesheets.

The stylesheet example below would transform any node matching my-node to JSON. If you import XSLTJSON in your stylesheet, you have to add the JSON namespace xmlns:json="http://json.org/" to your stylesheet because all functions and templates are in that namespace. The json:generate() function takes an XML node as input, generates a JSON representation of that node, and returns it as an xs:string. This is the only function you should call from your stylesheet.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:json="http://json.org/">
    <xsl:import href="xml-to-json.xsl"/>
    <xsl:template match="my-node">
        <xsl:value-of select="json:generate(.)"/>
    </xsl:template>
</xsl:stylesheet>

If your stylesheet's sole purpose is to transform XML to JSON, it would be easier to use the xml-to-json.xsl stylesheet directly from the command line. The following line shows how to do that using Java and Saxon.

java net.sf.saxon.Transform source.xml xml-to-json.xsl

You can also call the stylesheet programmatically, but this depends heavily on your programming environment, so please consult the documentation of your programming language or XSLT processor.

Parameters

The stylesheet is controlled by the seven parameters listed below. Six are Boolean and are turned off by default (set to false()); jsonp takes a string and defaults to an empty string. You can set them from the command line, from your program, or from another stylesheet. Four of the parameters control the output format and are discussed in more detail in the section on output formats.

  • use-badgerfish: Use the BadgerFish convention to output JSON without XML namespaces.
  • use-rabbitfish: Output basic JSON with an @ to mark XML attributes.
  • use-rayfish: Use the Rayfish convention to output JSON without XML namespaces.
  • use-namespaces: Output XML namespaces according to the BadgerFish convention.
  • debug: Enable or disable the output of the temporary XML tree used to generate JSON. Note that turning this on invalidates the JSON output.
  • jsonp: Enable JSONP; prepend the JSON output with the given string. Defaults to an empty string.
  • skip-root: Enable or disable skipping the root element and returning only the child elements of the root. Disabled by default.

For example, to transform source.xml to BadgerFish JSON with Saxon, you would invoke the following on the command line:

java net.sf.saxon.Transform source.xml xml-to-json.xsl use-badgerfish=true()

For other options consult the Saxon manual, or your XSLT processor's documentation.
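
If you call the stylesheet from a program, one option is to shell out to the same Saxon command line. The Python sketch below only assembles the argument list shown above and runs it. The function names saxon_command and transform are hypothetical, the parameter syntax is copied from the example above, only the Boolean parameters are covered, and it assumes that java is installed and that Saxon is on the class path.

```python
import subprocess


def saxon_command(source, stylesheet="xml-to-json.xsl", **flags):
    """Build the Saxon argument list; keyword flags map to Boolean parameters."""
    command = ["java", "net.sf.saxon.Transform", source, stylesheet]
    for name, enabled in flags.items():
        # Python keyword names cannot contain "-", so use_badgerfish
        # is rewritten to the stylesheet parameter use-badgerfish.
        parameter = name.replace("_", "-")
        command.append(f"{parameter}={'true()' if enabled else 'false()'}")
    return command


def transform(source, **flags):
    """Run the transformation and return what the processor writes to stdout."""
    result = subprocess.run(
        saxon_command(source, **flags),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Saxon exits with an error
    )
    return result.stdout


print(saxon_command("source.xml", use_badgerfish=True))
# prints ['java', 'net.sf.saxon.Transform', 'source.xml', 'xml-to-json.xsl', 'use-badgerfish=true()']
```

Calling transform("source.xml", use_badgerfish=True) would then return the JSON text, provided the assumptions above hold in your environment.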

If you import the stylesheet into your own stylesheet, you can override the default parameters by redefining them. For example, to output JSON using the BadgerFish convention, add the following parameter definition to your stylesheet.

    <xsl:param name="use-badgerfish" as="xs:boolean" select="true()"/>

You can force the creation of an array by adding the json:force-array attribute to an element in your XML. Instead of creating two nested objects, the following example creates an object containing an array.

<list json:force-array="true" xmlns:json="http://json.org/">
  <item>one</item>
</list>

{ "list": { "item": [ "one" ] } }

The force-array attribute will not be copied to the output JSON.

Output formats

There are four output formats in XSLTJSON; which one to use depends on your target application. If you want the most compact JSON, use the basic output. If you want to transform XML to JSON and JSON back to XML, use the BadgerFish output. If you want something in between, you could use the RabbitFish output, which is similar to the basic version but does distinguish between elements and attributes. If you're dealing with a lot of data-centric XML, you could use the highly structured Rayfish output. All four output formats ignore XML namespaces unless the use-namespaces parameter is set to true(), in which case namespaces are created according to the BadgerFish convention.

Each format has a list of rules by which XML is transformed to JSON. All but one of the examples for these rules are taken from the BadgerFish convention website, which makes comparing them easier.

Basic output (default)

The purpose of the basic output is to generate the most compact JSON possible. This is useful if you do not require round-trips between XML and JSON or if you need to send a large amount of data over a network. It borrows the $ syntax for text elements from the BadgerFish convention but attempts to avoid needless text-only JSON properties. It also does not distinguish between elements and attributes. The rules are:

  • Element names become object properties.

  • Text content of elements goes directly in the value of an object.

     <alice>bob</alice>
    

    becomes

     { "alice": "bob" }
    
  • Nested elements become nested properties.

     <alice><bob>charlie</bob><david>edgar</david></alice>
    

    becomes

     { "alice": { "bob": "charlie", "david": "edgar" } }
    
  • Multiple elements with the same name and at the same level become array elements.

    <alice><bob>charlie</bob><bob>david</bob></alice>
    

    becomes

    { "alice": { "bob": [ "charlie", "david" ] } }
    
  • Mixed content (element and text nodes) at the same level become array elements.

    <alice>bob<charlie>david</charlie>edgar</alice>
    

    becomes

    { "alice": [ "bob", { "charlie": "david" }, "edgar" ] }
    
  • Attributes go in properties.

    <alice charlie="david">bob</alice>
    

    becomes

    { "alice": { "charlie": "david", "$": "bob" } }
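
The six rules above can be sketched in a few lines of Python. This illustrates the rules only and is not XSLTJSON's implementation. The function names basic_value and to_basic are hypothetical, and the sketch assumes that surrounding whitespace is trimmed, that all values stay strings, that an empty element maps to null, and that XML namespaces, attributes on mixed-content elements, and name clashes between attributes and child elements can be ignored.

```python
import json
import xml.etree.ElementTree as ET


def basic_value(elem):
    """Return the JSON value of one element under the basic rules (sketch)."""
    children = list(elem)
    text = (elem.text or "").strip()
    tails = [(child.tail or "").strip() for child in children]

    # Mixed content: text and element nodes become array items in document order.
    if children and (text or any(tails)):
        items = [text] if text else []
        for child, tail in zip(children, tails):
            items.append({child.tag: basic_value(child)})
            if tail:
                items.append(tail)
        return items

    obj = dict(elem.attrib)  # attributes become plain properties
    groups = {}
    for child in children:
        groups.setdefault(child.tag, []).append(basic_value(child))
    for tag, values in groups.items():
        # Repeated sibling names collapse into an array.
        obj[tag] = values if len(values) > 1 else values[0]

    if not obj:
        return text or None  # text-only element: the bare string (assumed null if empty)
    if text:
        obj["$"] = text  # text next to attributes goes in "$"
    return obj


def to_basic(xml_string):
    root = ET.fromstring(xml_string)
    return {root.tag: basic_value(root)}


print(json.dumps(to_basic('<alice charlie="david">bob</alice>')))
# prints {"alice": {"charlie": "david", "$": "bob"}}
```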
    

BadgerFish convention (use-badgerfish)

The BadgerFish convention was invented by David Sklar; more detailed information can be found on his BadgerFish website. I have taken some liberties in supporting BadgerFish. One is the treatment of mixed content nodes (nodes with both text and element nodes as children), which the convention does not cover (except for a mention in the to-do list) but XSLTJSON supports. The other is that namespaces are optional instead of mandatory (which is also mentioned in the to-do list). The rules are:

  • Element names become object properties.

  • Text content of elements goes in the $ property of an object.

    <alice>bob</alice>
    

    becomes

    { "alice": { "$": "bob" } }
    
  • Nested elements become nested properties.

    <alice><bob>charlie</bob><david>edgar</david></alice>
    

    becomes

    { "alice": {"bob": { "$": "charlie" }, "david": { "$": "edgar" } } }
    
  • Multiple elements with the same name and at the same level become array elements.

    <alice><bob>charlie</bob><bob>david</bob></alice>
    

    becomes

    { "alice": { "bob": [ { "$": "charlie" }, { "$": "david" } ] } }
    
  • Mixed content (element and text nodes) at the same level become array elements.

    <alice>bob<charlie>david</charlie>edgar</alice>
    

    becomes

    { "alice": [ { "$": "bob" }, { "charlie": { "$": "david" } }, { "$": "edgar" } ] }
    
  • Attributes go in properties whose names begin with @.

    <alice charlie="david">bob</alice>
    

    becomes

    { "alice": { "@charlie": "david", "$": "bob" } }
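
As a rough illustration, the BadgerFish rules listed above (without namespaces) could be written in Python as follows. This is a sketch of the rules, not the stylesheet's actual logic. The function names badgerfish_value and to_badgerfish are hypothetical, and the sketch assumes that surrounding whitespace is trimmed, that an empty element maps to an empty object, and that attributes on mixed-content elements can be ignored.

```python
import xml.etree.ElementTree as ET


def badgerfish_value(elem):
    """Return the JSON value of one element under the BadgerFish rules (sketch)."""
    children = list(elem)
    text = (elem.text or "").strip()
    tails = [(child.tail or "").strip() for child in children]

    # Mixed content: each text node becomes a {"$": ...} array item.
    if children and (text or any(tails)):
        items = [{"$": text}] if text else []
        for child, tail in zip(children, tails):
            items.append({child.tag: badgerfish_value(child)})
            if tail:
                items.append({"$": tail})
        return items

    # Attribute names are prefixed with "@".
    obj = {"@" + name: value for name, value in elem.attrib.items()}
    groups = {}
    for child in children:
        groups.setdefault(child.tag, []).append(badgerfish_value(child))
    for tag, values in groups.items():
        # Repeated sibling names collapse into an array.
        obj[tag] = values if len(values) > 1 else values[0]
    if text:
        obj["$"] = text  # text content always lives in "$"
    return obj


def to_badgerfish(xml_string):
    root = ET.fromstring(xml_string)
    return {root.tag: badgerfish_value(root)}
```

Because text always goes in "$" and attributes always carry an "@", elements, attributes, and text stay distinguishable in the output, which is what makes round-trips possible.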
    

RabbitFish (use-rabbitfish)

RabbitFish is identical to the basic output format except that it uses Rule 6 "Attributes go in properties whose names begin with @" from the BadgerFish convention in order to distinguish between elements and attributes.

Rayfish (use-rayfish)

The Rayfish convention was invented by Micheal Matthew and aims to create highly structured JSON which is easy to parse and extract information from due to its regularity. This makes it an excellent choice for data-centric XML documents. The downside is that it does not support mixed content (elements and text nodes at the same level) and is slightly more verbose than the other output formats. The rules are:

  • Elements are transformed into an object with three properties: #name, #text and #children. The name property contains the name of the element, the text property contains the text contents of the element and the children property contains an array of the child elements.

    <alice/>
    

    becomes

    { "#name": "alice", "#text": null, "#children": [ ] }
    
  • Nested elements become members of the #children property of the parent element.

    <alice><bob>charlie</bob><david>edgar</david></alice>
    

    becomes

    { "#name": "alice", "#text": null, "#children": [ 
        { "#name": "bob", "#text": "charlie", "#children": [ ] }, 
        { "#name": "david", "#text": "edgar", "#children": [ ] }
    ]}
    
  • Attributes go into an object in the #children property, and their names begin with @.

    <alice charlie="david">bob</alice>
    

    becomes

    { "#name": "alice", "#text": "bob", "#children": [ 
        { "#name": "@charlie", 
          "#text": "david", 
          "#children": [ ] 
        }
    ]}
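
Because every node has the same three properties, the Rayfish rules above fit in one short recursive function. The Python sketch below is an illustration under assumptions, not the stylesheet's code: the function name rayfish is hypothetical, whitespace is trimmed, attributes are placed before child elements in #children, and text that follows a child element is dropped because mixed content is not supported.

```python
import xml.etree.ElementTree as ET


def rayfish(elem):
    """Return the Rayfish object for one element (sketch of the rules above)."""
    text = (elem.text or "").strip()
    # Attributes become child objects whose #name starts with "@".
    children = [
        {"#name": "@" + name, "#text": value, "#children": []}
        for name, value in elem.attrib.items()
    ]
    # Child elements follow, each converted recursively.
    children.extend(rayfish(child) for child in elem)
    return {"#name": elem.tag, "#text": text or None, "#children": children}


def to_rayfish(xml_string):
    return rayfish(ET.fromstring(xml_string))
```

A consumer can then walk any document with the same loop over #children, without checking whether a value is a string, an object, or an array.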
    

Namespaces (use-namespaces)

When turned on, namespaces are created according to the BadgerFish convention. In basic output, the @ is left out of the property name.

XSLTJSON Lite (XSLT 1.0 compatible)

The XSLTJSON Lite stylesheet transforms arbitrary XML to the JSONML format. It is written in XSLT 1.0, so it is compatible with all XSLT 1.0 and 2.0 processors, as well as the XSLT processor built into most modern browsers (for client-side transformations). The stylesheet doesn't take any parameters and has no configurable options. Use it as you would any other XSLT stylesheet.

Requirements

XSLTJSON requires an XSLT 2.0 processor. An excellent option is Saxon, which was used to test and develop XSLTJSON.

XSLT 2.0?

Don't have an XSLT 2.0 processor? Check out Micheal Matthew's Rayfish project, xml2json, or a modified xml2json version by Martynas Jusevičius. You can also use XSLTJSON Lite to transform XML to JSONML.

License

XSLTJSON is licensed under the new BSD License (see the header comment).

Credits

Thanks to: Chick Markley (octal number and numbers with terminating period fix), Torben Schreiter (suggestions for skip-root and newline entities bug fix), Michael Nilsson (bug report and test cases for json:force-array), Rick Brown (bug report and fix for numbers starting with a '+' symbol).
