Jawn is for parsing jay-sawn (JSON)

Jawn

"Jawn is for parsing jay-sawn."

Origin

The term "jawn" comes from the Philadelphia area. It conveys about as much information as "thing" does. I chose the name because I had moved to Montreal so I was remembering Philly fondly. Also, there isn't a better way to describe objects encoded in JSON than "things". Finally, we get a catchy slogan.

Jawn was designed to parse JSON into an AST as quickly as possible.

Overview

Jawn consists of three parts:

  1. A fast, generic JSON parser (jawn-parser)
  2. A small, somewhat anemic AST (jawn-ast)
  3. A few helpful utilities (jawn-util)

Currently Jawn is competitive with the fastest Java JSON libraries (GSON and Jackson) and in the author's benchmarks it often wins. It seems to be faster than any other Scala parser that exists (as of July 2014).

Given the plethora of really nice JSON libraries for Scala, the expectation is that you're probably here for jawn-parser or a support package.

Quick Start

Jawn supports Scala 2.12, 2.13, and 3 on the JVM and Scala.js. Scala 2.12 and 2.13 are supported on Scala Native.

Here's a build.sbt snippet that shows you how to depend on Jawn in your own sbt project:

// use this if you just want jawn's parser, and will implement your own facade
libraryDependencies += "org.typelevel" %% "jawn-parser" % "1.3.2"

// use this if you want jawn's parser and also jawn's ast
libraryDependencies += "org.typelevel" %% "jawn-ast" % "1.3.2"

If you want to use Jawn's parser with another project's AST, see the "Supporting external ASTs with Jawn" section. There are a few reasons you might want to do this:

  • The library's built-in parser is significantly slower than Jawn's.
  • Jawn supports more input types (ByteBuffer, File, etc.).
  • You need asynchronous JSON parsing.

Dependencies

jawn-parser has no dependencies other than Scala.

jawn-ast depends on jawn-parser but nothing else.

Parsing

Jawn's parser is both fast and relatively featureful. Assuming you want to get back an AST of type J and you have a Facade[J] defined, you can use the following parse signatures:

Parser.parseUnsafe[J](String) → J
Parser.parseFromString[J](String) → Try[J]
Parser.parseFromPath[J](String) → Try[J]
Parser.parseFromFile[J](File) → Try[J]
Parser.parseFromChannel[J](ReadableByteChannel) → Try[J]
Parser.parseFromByteBuffer[J](ByteBuffer) → Try[J]
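
As a minimal sketch of two of these signatures, assuming the jawn-ast module is on the classpath so that its JParser object (the one used in the async example in this README) supplies the facade for JValue. The names ParseExample, fromString, and fromUtf8Bytes are hypothetical:

```scala
import java.nio.ByteBuffer
import scala.util.Try
import org.typelevel.jawn.ast.{JParser, JValue}

object ParseExample {
  // Parse a JSON document already held in memory as a String.
  // Malformed input comes back as a Failure rather than a thrown exception.
  def fromString(json: String): Try[JValue] =
    JParser.parseFromString(json)

  // Parse UTF-8 encoded bytes without first decoding them to a String.
  def fromUtf8Bytes(bytes: Array[Byte]): Try[JValue] =
    JParser.parseFromByteBuffer(ByteBuffer.wrap(bytes))
}
```

Calling ParseExample.fromString on a well-formed document should give a Success, while a truncated document such as "{" should give a Failure that can be handled with the usual Try combinators.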

Jawn also supports asynchronous parsing, which allows users to feed the parser with data as it is available. There are three modes:

  • SingleValue waits until parsing is done, then returns a single J value.
  • UnwrapArray returns values as they become available if the top-level element is an array. Set multiValue to true if you want to support multiple top-level arrays.
  • ValueStream parses one or more JSON values separated by whitespace.

Here's an example:

import org.typelevel.jawn.ast
import org.typelevel.jawn.AsyncParser
import org.typelevel.jawn.ParseException

val p = ast.JParser.async(mode = AsyncParser.UnwrapArray)

def chunks: Stream[String] = ???
def sink(j: ast.JValue): Unit = ???

def loop(st: Stream[String]): Either[ParseException, Unit] =
  st match {
    case s #:: tail =>
      p.absorb(s) match {
        case Right(js) =>
          js.foreach(sink)
          loop(tail)
        case Left(e) =>
          Left(e)
      }
    case _ =>
      p.finish().map(_.foreach(sink))
  }

loop(chunks)

You can also call Parser.async[J] to use async parsing with an arbitrary data type (provided you also have an implicit Facade[J]).
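
The ValueStream mode can be driven the same way. The following is a sketch only, reusing the absorb and finish calls from the example above. The names ValueStreamExample and parseAll are hypothetical, and collecting results into a Vector is an illustrative choice rather than anything Jawn requires:

```scala
import org.typelevel.jawn.AsyncParser
import org.typelevel.jawn.ParseException
import org.typelevel.jawn.ast

object ValueStreamExample {
  // Feed every chunk to a fresh ValueStream parser and collect the values it emits.
  // Once a chunk fails to parse, no further chunks are absorbed.
  def parseAll(chunks: List[String]): Either[ParseException, Vector[ast.JValue]] = {
    val p = ast.JParser.async(mode = AsyncParser.ValueStream)
    val start: Either[ParseException, Vector[ast.JValue]] = Right(Vector.empty)
    val absorbed = chunks.foldLeft(start) { (acc, chunk) =>
      acc.flatMap(seen => p.absorb(chunk).map(js => seen ++ js))
    }
    // finish() signals end of input and returns any value that was still pending.
    absorbed.flatMap(seen => p.finish().map(js => seen ++ js))
  }
}
```

Chunk boundaries do not need to line up with value boundaries: two whitespace-separated objects split in the middle of the second one should still come back as two values.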

Supporting external ASTs with Jawn

Circe

circe is supported via its circe-parser module.

Argonaut

argonaut is supported via its argonaut-jawn module.

Do-It-Yourself Parsing

Jawn supports building any JSON AST you need via type classes. You benefit from Jawn's fast parser while still using your favorite Scala JSON library. This mechanism is also what allows Jawn to provide "support" for other libraries' ASTs.

To include Jawn's parser in your project, add the following snippet to your build.sbt file:

resolvers += Resolver.sonatypeRepo("releases")

libraryDependencies += "org.typelevel" %% "jawn-parser" % "1.3.2"

To support your AST of choice, you'll want to define a Facade[J] instance, where the J type parameter represents the base of your JSON AST. For example, here's a facade that supports Spray:

import spray.json._
object Spray extends SimpleFacade[JsValue] {
  def jnull() = JsNull
  def jfalse() = JsFalse
  def jtrue() = JsTrue
  def jnum(s: String) = JsNumber(s)
  def jint(s: String) = JsNumber(s)
  def jstring(s: String) = JsString(s)
  def jarray(vs: List[JsValue]) = JsArray(vs)
  def jobject(vs: Map[String, JsValue]) = JsObject(vs)
}

Most ASTs will be easy to define using the SimpleFacade or MutableFacade traits. However, if an AST's object or array instances do more than just wrap a Scala collection, it may be necessary to extend Facade directly.

Extend SupportParser[J], supplying your facade as the abstract facade, to get convenient methods for parsing various input types or an AsyncParser.

Using the AST

Access

For accessing atomic values, JValue supports two sets of methods: get-style methods and as-style methods.

The get-style methods return Some(_) when called on a compatible JSON value (e.g. strings can return Some[String], numbers can return Some[Double], etc.), and None otherwise:

getBoolean → Option[Boolean]
getString → Option[String]
getLong → Option[Long]
getDouble → Option[Double]
getBigInt → Option[BigInt]
getBigDecimal → Option[BigDecimal]

In contrast, the as-style methods will either return an unwrapped value (instead of returning Some(_)) or throw an exception (instead of returning None):

asBoolean → Boolean // or exception
asString → String // or exception
asLong → Long // or exception
asDouble → Double // or exception
asBigInt → BigInt // or exception
asBigDecimal → BigDecimal // or exception

To access elements of an array, call get with an Int position:

get(i: Int) → JValue // returns JNull if index is illegal

To access elements of an object, call get with a String key:

get(k: String) → JValue // returns JNull if key is not found

Both of these methods also return JNull if the value is not the appropriate container. This allows the caller to chain lookups without having to check that each level is correct:

val v: JValue = ???

// returns JNull if a problem is encountered in structure of 'v'.
val t: JValue = v.get("novels").get(0).get("title")

// if 'v' had the right structure and 't' is JString(s), then Some(s).
// otherwise, None.
val titleOrNone: Option[String] = t.getString

// equivalent to titleOrNone.getOrElse(throw ...)
val titleOrDie: String = t.asString
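
To make the chaining behaviour concrete, here is a small sketch with an actual document in place of ???. It assumes jawn-ast is on the classpath. The object name AccessExample and the sample JSON are made up for illustration:

```scala
import scala.util.Try
import org.typelevel.jawn.ast.{JNull, JParser, JValue}

object AccessExample {
  val v: JValue = JParser.parseUnsafe(
    """{"novels": [{"title": "Emma", "year": 1815}]}"""
  )

  // Every level exists, so the lookup reaches the JString.
  val title: Option[String] = v.get("novels").get(0).get("title").getString

  // Index 5 is out of range: the chain collapses to JNull instead of throwing.
  val missing: JValue = v.get("novels").get(5).get("title")
  val missingTitle: Option[String] = missing.getString

  // The as-style accessor throws on the same value, so wrap it if failure is expected.
  val missingOrFailure: Try[String] = Try(missing.asString)
}
```

The get-style call is the safer default when the document's shape is not guaranteed. The as-style call suits cases where a missing field really is a bug.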

Updating

The atomic values (JNull, JBoolean, JNum, and JString) are immutable.

Objects are fully-mutable and can have items added, removed, or changed:

set(k: String, v: JValue) → Unit
remove(k: String) → Option[JValue]

If set is called on a non-object, an exception will be thrown. If remove is called on a non-object, None will be returned.

Arrays are semi-mutable. Their values can be changed, but their size is fixed:

set(i: Int, v: JValue) → Unit

If set is called on a non-array, or called with an illegal index, an exception will be thrown.

(A future version of Jawn may provide an array whose length can be changed.)
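
The update rules above can be sketched as follows, assuming jawn-ast is on the classpath. The object name UpdateExample and the sample document are hypothetical, and JString is the only value constructed directly:

```scala
import scala.util.Try
import org.typelevel.jawn.ast.{JParser, JString, JValue}

object UpdateExample {
  val doc: JValue =
    JParser.parseUnsafe("""{"name": "jawn", "tags": ["json", "fast"]}""")

  // Objects: add or overwrite a key in place, then remove one.
  doc.set("lang", JString("scala"))
  val removed: Option[JValue] = doc.remove("name")

  // Arrays: replace an existing slot; the length stays at 2.
  val tags: JValue = doc.get("tags")
  tags.set(1, JString("small"))

  // An out-of-range index throws instead of growing the array.
  val grow: Try[Unit] = Try(tags.set(2, JString("extra")))
}
```

Because these updates mutate the parsed value in place, any other reference to the same JValue sees the change too.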

Profiling

Jawn uses JMH along with the sbt-jmh plugin.

Running Benchmarks

The benchmarks are located in the benchmark project. You can run the benchmarks by typing benchmark/jmh:run from SBT. There are many supported arguments, so here are a few examples:

Run all benchmarks, with 10 warmups, 10 iterations, using 3 threads:

benchmark/jmh:run -wi 10 -i 10 -f1 -t3

Run just the CountriesBench test (5 warmups, 5 iterations, 1 thread):

benchmark/jmh:run -wi 5 -i 5 -f1 -t1 .*CountriesBench

Benchmark Issues

Currently, the benchmarks are a bit fiddly. The most obvious symptom is that if you compile the benchmarks, make changes, and compile again, you may see errors like:

[error] (benchmark/jmh:generateJavaSources) java.lang.NoClassDefFoundError: jawn/benchmark/Bla25Bench

The fix here is to run benchmark/clean and try again.

You will also see intermittent problems like:

[error] (benchmark/jmh:compile) java.lang.reflect.MalformedParameterizedTypeException

The solution here is easier (though frustrating): just try it again. If you continue to have problems, consider cleaning the project and trying again.

(In the future I hope to make the benchmarking here a bit more resilient. Suggestions and pull requests gladly welcome!)

Files

The benchmarks use files located in benchmark/src/main/resources. If you want to test your own files (e.g. mydata.json), you would:

  • Copy the file to benchmark/src/main/resources/mydata.json.
  • Add the following code to JmhBenchmarks.scala:
class MyDataBench extends JmhBenchmarks("mydata.json")

Jawn has been tested with much larger files, e.g. 100M - 1G, but these are obviously too large to ship with the project.

With large files, it's usually easier to comment out most of the benchmarking methods and only test one (or a few) methods. Some of the slower JSON parsers get much slower for large files.

Interpreting the results

Remember that the benchmarking results you see will vary based on:

  • Hardware
  • Java version
  • JSON file size
  • JSON file structure
  • JSON data values

I have tried to use each library in the most idiomatic and fastest way possible (to parse the JSON into a simple AST). Pull requests to update library versions and improve usage are very welcome.

Future Work

More support libraries could be added.

It's likely that some of Jawn's I/O could be optimized a bit more, and also made more configurable. The heuristics around all-at-once loading versus input chunking could definitely be improved.

In cases where the user doesn't need fast lookups into JSON objects, an even lighter AST could be used to improve parsing and rendering speeds.

Strategies to cache/intern field names of objects could pay big dividends in some cases (this might require AST changes).

If you have ideas for any of these (or other ideas) please feel free to open an issue or pull request so we can talk about it.

Disclaimers

Jawn only supports UTF-8 when parsing bytes. This might change in the future, but for now that's the target case. You can always decode your data to a string, and handle the character set decoding using Java's standard tools.
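
For input in another encoding, the decode-first approach might look like the sketch below. It assumes jawn-ast is on the classpath, and the names CharsetExample and parseWithCharset are hypothetical:

```scala
import java.nio.charset.Charset
import scala.util.Try
import org.typelevel.jawn.ast.{JParser, JValue}

object CharsetExample {
  // Decode the bytes with the JDK first, then hand the resulting String to the parser.
  def parseWithCharset(bytes: Array[Byte], charset: Charset): Try[JValue] =
    JParser.parseFromString(new String(bytes, charset))
}
```

For example, passing java.nio.charset.StandardCharsets.ISO_8859_1 as the charset lets Latin-1 encoded input be parsed, at the cost of materializing the whole document as a String first.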

Jawn's AST is intended to be very lightweight and simple. It supports simple access, and limited mutable updates. It intentionally lacks the power and sophistication of many other JSON libraries.

Community

People are expected to follow the Scala Code of Conduct when discussing Jawn on GitHub or other venues.

Jawn's current maintainers are:

Copyright and License

All code is available to you under the MIT license, available at http://opensource.org/licenses/mit-license.php.

Copyright Erik Osheim, 2012-2022.
