A parser for invalid JSON

dirty-json

npm install dirty-json

A JSON parser that tries to handle non-conforming or otherwise invalid JSON.

You can play around with a demo here: http://rmarcus.info/dirty-json/

You might also be interested in my blog post about the parser.

Turn this:

[5, .5, 'single quotes', "quotes in "quotes" in quotes"]

Into this:

[5,0.5,"single quotes","quotes in \"quotes\" in quotes"]

Why?

We all love JSON. But sometimes, out in that scary place called "the real world", we see something like this:

{ "user": "<div class="user">Ryan</div>" }

Or even something like this:

{ user: '<div class="user">
Ryan
</div>' }

While these are obviously cringe-worthy, we still want a way to parse them. dirty-json provides a library to do exactly that.

Examples

dirty-json does not require object keys to be quoted, and can handle single-quoted value strings.

const dJSON = require('dirty-json');
const r = dJSON.parse("{ test: 'this is a test'}");
console.log(JSON.stringify(r));

// output: {"test":"this is a test"}

dirty-json can handle embedded quotes in strings.

const dJSON = require('dirty-json');
const r = dJSON.parse('{ "test": "some text "a quote" more text"}');
console.log(JSON.stringify(r));

// output: {"test":"some text \"a quote\" more text"}

dirty-json can handle newlines inside of a string.

const dJSON = require('dirty-json');
const r = dJSON.parse('{ "test": "each \n on \n new \n line"}');
console.log(JSON.stringify(r));

// output: {"test":"each \n on \n new \n line"}

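By default, dirty-json keeps only the last value of a duplicated key, as standard JSON parsers do. With the duplicateKeys option set to true, it instead keeps every value in a nested value/next structure.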

const dJSON = require('dirty-json');
const r = dJSON.parse('{"key": 1, "key": 2, \'key\': [1, 2, 3]}');
console.log(JSON.stringify(r));
// output: {"key":[1,2,3]}

const r2 = dJSON.parse('{"key": 1, "key": 2, \'key\': [1, 2, 3]}', {"duplicateKeys": true});
console.log(JSON.stringify(r2));
// output: {"key":{"value":{"value":1,"next":2},"next":[1,2,3]}}
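
The nested structure produced with duplicateKeys can be awkward to consume directly. The following is a minimal sketch of how it might be unrolled into a flat list, assuming the value/next shape shown in the output above. The helper name flattenDuplicates is hypothetical and not part of dirty-json, and the helper cannot tell a chained duplicate apart from a genuine object that happens to have both a value and a next key.

```javascript
// Hypothetical helper, not part of dirty-json.
// Assumes duplicates are chained as { value: <earlier values>, next: <latest value> }.
function flattenDuplicates(node) {
  const isChain =
    node !== null &&
    typeof node === 'object' &&
    !Array.isArray(node) &&
    'value' in node &&
    'next' in node;
  if (!isChain) {
    // A plain value (or an array) is a single entry.
    return [node];
  }
  // Unroll the earlier values first, then append the most recent one.
  return [...flattenDuplicates(node.value), node.next];
}

// Using the structure from the example above:
const chained = { value: { value: 1, next: 2 }, next: [1, 2, 3] };
console.log(JSON.stringify(flattenDuplicates(chained)));
```

The values come back in the order in which the keys appeared in the input.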

But what about THIS ambiguous example?

Since dirty-json is handling malformed JSON, it will not always produce the result that you "think" it should. That's why you should only use this when you absolutely need it. Malformed JSON is malformed for a reason.
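
One way to follow this advice is to try the strict parser first and fall back to a lenient one only when strict parsing fails. The sketch below assumes nothing about dirty-json beyond the parse function shown in the examples. The helper name parseWithFallback and the strict flag in its result are hypothetical.

```javascript
// Hypothetical helper: strict JSON first, lenient parser only on a syntax error.
// lenientParse is any function that takes a string and returns a parsed value.
function parseWithFallback(text, lenientParse) {
  try {
    return { value: JSON.parse(text), strict: true };
  } catch (err) {
    // Only fall back on malformed input; rethrow anything unexpected.
    if (!(err instanceof SyntaxError)) throw err;
    return { value: lenientParse(text), strict: false };
  }
}

// With dirty-json installed, usage might look like:
//   const dJSON = require('dirty-json');
//   const result = parseWithFallback(input, (t) => dJSON.parse(t));
//   if (!result.strict) console.warn('input was not valid JSON');
```

Returning the strict flag lets the caller log or reject inputs that needed the lenient path, so malformed data does not pass through unnoticed.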

How does it work?

Currently dirty-json uses a lexer powered by lex and a hand-written LR(1) parser. It shouldn't be used in any environment that requires reliable or fast results.
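
To give a rough idea of what a regex-driven lexer does, here is a toy tokenizer. It is an illustration only: it uses plain sticky regular expressions rather than the lex package, its token names and rules are made up for this sketch and are not dirty-json's, and it has no parser stage.

```javascript
// Toy lexer for illustration; NOT dirty-json's actual rules or token set.
// Each rule is tried in order at the current position (sticky flag "y").
const TOKEN_RULES = [
  ['whitespace', /\s+/y],
  ['number', /-?(?:\d+\.?\d*|\.\d+)/y],
  ['string', /"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'/y],
  ['punct', /[{}\[\]:,]/y],
  ['word', /[A-Za-z_$][\w$]*/y],
];

function tokenize(text) {
  const tokens = [];
  let pos = 0;
  while (pos < text.length) {
    let matched = false;
    for (const [type, pattern] of TOKEN_RULES) {
      pattern.lastIndex = pos; // a sticky regex matches only at lastIndex
      const m = pattern.exec(text);
      if (m) {
        if (type !== 'whitespace') tokens.push({ type, text: m[0] });
        pos = pattern.lastIndex;
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new SyntaxError(`Unexpected character at position ${pos}: ${text[pos]}`);
    }
  }
  return tokens;
}

console.log(tokenize("{ test: .5 }").map((t) => t.type).join(' '));
```

A parser would then consume this token stream and decide, for example, that a bare word before a colon is an unquoted object key.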

Security concerns

This package makes heavy use of regular expressions in its lexer. As a result, it may be vulnerable to a ReDoS (regular expression denial of service) attack. Versions prior to 0.5.1 and after 0.0.5 were definitely vulnerable (thanks to Jamie Davis for pointing this out). I believe version 0.5.1 and later are safe, but since I do not know of any tool to verify a regular expression, I can't prove it.
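
When parsing untrusted input, one common precaution is to cap the input size before it reaches a regex-heavy parser. The sketch below shows that idea. It is not something dirty-json provides, the helper name parseUntrusted and the default limit are hypothetical, and a length cap only bounds the worst-case work rather than eliminating the risk.

```javascript
// Hypothetical guard, not part of dirty-json.
// The limit is an arbitrary illustrative value; tune it for your own inputs.
const DEFAULT_MAX_LENGTH = 10000;

function parseUntrusted(text, parse, maxLength = DEFAULT_MAX_LENGTH) {
  if (typeof text !== 'string') {
    throw new TypeError('expected a string');
  }
  if (text.length > maxLength) {
    throw new RangeError(
      `input of ${text.length} characters exceeds the limit of ${maxLength}`
    );
  }
  // parse is any parsing function, for example (t) => dJSON.parse(t).
  return parse(text);
}
```

For stronger isolation, the parse call could also be run in a separate worker or process with a timeout, which is outside the scope of this sketch.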

Acknowledgements

Thanks to users Moai- and 0x0a0d for fixing array prototype leakage.

License

Copyright 2020, 2018, 2016, 2015, 2014 Ryan Marcus

dirty-json is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

dirty-json is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with dirty-json. If not, see http://www.gnu.org/licenses/.
