Parsley
What is Parsley?
Parsley is a fast and modern parser combinator library for Scala, based loosely on a Haskell-style parsec API.
How do I use it?
Parsley is distributed on Maven Central, and can be added to your project via:
libraryDependencies += "com.github.j-mie6" %% "parsley" % "4.2.11"
Documentation can be found here
If you're a `cats` user, you may also be interested in using `parsley-cats` to augment `parsley` with instances for various `cats` typeclasses:
libraryDependencies += "com.github.j-mie6" %% "parsley-cats" % "1.2.0"
Examples
scala> import parsley.Parsley
scala> import parsley.implicits.character.{charLift, stringLift}
scala> val hello: Parsley[Unit] = ('h' *> ("ello" <|> "i") *> " world!").void
scala> hello.parse("hello world!")
val res0: parsley.Result[String,Unit] = Success(())
scala> hello.parse("hi world!")
val res1: parsley.Result[String,Unit] = Success(())
scala> hello.parse("hey world!")
val res2: parsley.Result[String,Unit] =
Failure((line 1, column 2):
  unexpected "ey"
  expected "ello"
  >hey world!
    ^^)
scala> import parsley.character.digit
scala> val natural: Parsley[Int] = digit.foldLeft1(0)((n, d) => n * 10 + d.asDigit)
scala> natural.parse("0")
val res3: parsley.Result[String,Int] = Success(0)
scala> natural.parse("123")
val res4: parsley.Result[String,Int] = Success(123)
For more see the Wiki!
What are the differences to Haskell's parsec?
Mostly, this library is quite similar. However, due to Scala's differences in operator characters, a few operators are changed:
- `(<$>)` is known as `map`
- `try` is known as `attempt`
- `(<$)` and `($>)` are `<#` and `#>` respectively.
In addition, `lift2` and `lift3` are uncurried in this library: this is to provide better performance and easier usage with Scala's traditionally uncurried functions. There are also a few new operators in general to be found here!
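As a rough illustration of the renamings listed above, the sketch below pairs each Haskell form (shown in comments) with a Scala counterpart. It assumes parsley 4.2.x; the object and parser names (`ParsecDifferences`, `digitValue`, `keyword`, `truth`, `sum`) are made up for this example.

```scala
import parsley.Parsley
import parsley.Parsley.attempt
import parsley.character.{digit, string}
import parsley.lift.lift2

object ParsecDifferences {
  // Haskell: digitToInt <$> digit
  val digitValue: Parsley[Int] = digit.map(_.asDigit)

  // Haskell: try (string "let") <|> string "lambda"
  // Without attempt, "let" would consume the 'l' of "lambda" before failing,
  // and the second alternative would then not be tried.
  val keyword: Parsley[String] = attempt(string("let")) <|> string("lambda")

  // Haskell: string "true" $> True
  val truth: Parsley[Boolean] = string("true") #> true

  // Haskell: liftA2 (+) digitValue digitValue, where (+) is curried;
  // here the function is an ordinary uncurried (Int, Int) => Int.
  val sum: Parsley[Int] = lift2[Int, Int, Int](_ + _, digitValue, digitValue)
}
```

Passing an uncurried function means an ordinary Scala lambda or method reference can be handed to `lift2` directly, without first converting it with `.curried`.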
Library Evolution
Parsley is a modern parser combinator library, which strives to be on the bleeding-edge of parser combinator library design. This means that improvements will come naturally over time. Feel free to suggest improvements for consideration, as well as high-level problems you commonly encounter that we may be able to find a way to mitigate (see the Design Patterns for Parser Combinators paper for example!).
Frequency of Major Changes
Part of innovation is being willing to admit design mistakes and rectify them: when a binary-breaking release is made, the opportunity may be taken to polish parts of the library's API that are clunky, or could be better organised or improved. For example, see the differences between `parsley-3.3.10` and `parsley-4.0.0`! However, constant breaking changes are not a good way to encourage the use of a library, as users often want stability: to that end, annoyances and bugbears with the API are only addressed approximately yearly, and the frequency of these will decrease over time.
For future major releases, care will be taken to, wherever possible, publish all patch-level changes in a final version to the previous `major.minor` version, and then all minor-level changes as a final `major.(minor+1).0` version before releasing the major-level changes as `(major+1).0.0`: this will allow users stuck on the old version to benefit as much as possible from the fixes and new functionality.
Versioning Policy
As of `4.0.0`, `parsley` is strictly committed to `early-semver`, which means that the version numbers are significant:
- Two versions `x._._` and `y._._` with `x != y` are incompatible with each other at a binary level: having `x._._` on the classpath with code compiled against `y._._` will most likely result in a linkage-error at runtime.
- Two versions `a.x._` and `a.y._` with `x <= y` are binary compatible, which means that code compiled against `a.x._` will still work with `a.y._` on the classpath. A "source" component `y > x` indicates that `a.y._` has added or deprecated functionality since `a.x._`.
- Two versions `a.b.x` and `a.b.y` are binary and source compatible, which means there are no compatibility concerns between the two versions. Code compiled against `a.b.x` will run with `a.b.y` on the classpath and vice-versa. A "patch" component `y > x` indicates that `a.b.y` fixes issues (bugs or poor performance) with `a.b.x`.
In short, if you are on version `a.x.y`, you can upgrade to version `a.x.z` if `z > y` without worry, and upgrade to `a.z._` if `z > x` with a possible (but rare) need to make minor updates to your code. Occasionally, a "source" component bump may deprecate functionality, but it will provide a migration to tell you how to avoid the deprecation warning. Altered/deprecated functionality may be hidden from the public API in a binary backwards compatible way in a "source" bump and therefore may require updating when recompiled; this will be done sparingly and with minimal disruption so as not to discourage updating the library, and any immediate migration changes to user code from `a.x._` to any `a.y._` with `y > x` will be documented in `a.y._`'s release.
Note: all functionality marked as `private [parsley]` or within the `parsley.internal` package does not adhere to `early-semver` and may be removed or changed at will with no impact to regular/intended use of the library.
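The binary-compatibility rules above can be summarised as a small predicate. This is only a sketch: the `Version` class and its `linksAgainst` method are hypothetical helpers written for this illustration, they are not part of parsley, and they assume versions from `4.0.0` onwards.

```scala
// Hypothetical helper for illustration only; it is not part of parsley.
final case class Version(major: Int, minor: Int, patch: Int) {
  // True when code compiled against this version is expected to link and run
  // with `onClasspath` under the rules above: the major (binary) component
  // must match, and the minor (source) component on the classpath must not be
  // older than the one compiled against. The patch component is ignored,
  // since patch releases are compatible in both directions.
  def linksAgainst(onClasspath: Version): Boolean =
    major == onClasspath.major && minor <= onClasspath.minor
}
```

For instance, `Version(4, 1, 8).linksAgainst(Version(4, 2, 11))` is `true`, whereas `Version(3, 3, 10).linksAgainst(Version(4, 0, 0))` and `Version(4, 2, 0).linksAgainst(Version(4, 1, 8))` are both `false`.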
Release Candidates and Milestones
Occasionally, a minor (source) release will contain either a significant body of new work, or a significant rework of some internal machinery. In these cases additional versioning may be employed:
- Experimental (and volatile) new functionality may be iterated with `a.b.0-Mn` versions: these are (hopefully) working pre-release versions of the functionality, subject to even binary incompatible changes between `M` versions. When the new API and behaviour become stable, the release graduates to the `a.b.0-RC1` release candidate.
- Release candidates are used to iron out any lingering issues with a minor release and potentially alter the finer points of the new functionality's behaviour. Binary compatibility will be preserved between `RCx` and `RCy` with `y > x` except within truly exceptional circumstances.
- Finally, the release makes it to `a.b.0` and is hopefully truly stable.
Version EoL (End of Life) Policy
Old versions of the library may still be given important bug-fixes after they have been made obsolete by a new release. In exceptional circumstances, performance problems may be addressed for old versions. The lifetime policy is as follows:
- A major (binary) version reaches EoL a minimum of 6 months after its successor is released, unless an extension to its life is requested via an issue.
- A minor (source) version reaches EoL immediately on the release of its successor, unless deprecations were issued by its successor, in which case it will reach EoL after a minimum of 3 months.
Some more minor bugfixes may not be ported to previous versions if (a) the bug does not appear in that version or (b) the code has changed too much internally to make porting feasible.
An exception to this policy is made for any version `3.x.y`, which reaches EoL effective immediately (December 2022) excluding exceptional circumstances.
Version | Released On | EoL Status |
---|---|---|
`3.3.0` | January 7th 2022 | EoL reached (`3.3.10`) |
`4.0.0` | November 30th 2022 | EoL reached (`4.0.4`) |
`4.1.0` | January 18th 2023 | EoL reached (`4.1.8`) |
`4.2.0` | January 22nd 2023 | Enjoying indefinite support |
Bug Reports
If you encounter a bug when using Parsley, try to minimise the example of the parser (and the input) that triggers the bug. If possible, make a self-contained example: this will help to identify the problem without too much difficulty.
How does it work?
Parsley represents parsers as an abstract syntax tree (AST), which is constructed lazily. As a result, Parsley is able to perform analysis and optimisations on your parsers, which helps reduce the burden on you, the programmer. This representation is then compiled into a light-weight stack-based instruction set designed to run fast on the JVM. This is what gives Parsley its competitive performance, but for best effect a parser should be compiled once and used many times (so-called hot execution).
To make recursive parsers work in this AST format, you must ensure that recursion is done by knot-tying: you should define all recursive parsers with `val` and introduce `lazy val` where necessary for the compiler to accept the definition.
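A minimal sketch of knot-tying, assuming parsley 4.2.x: the parser below refers to itself and is therefore declared as a `lazy val`. The object name `Nesting` and the parser name `depth` are invented for this example.

```scala
import parsley.Parsley
import parsley.Parsley.pure
import parsley.character.char

object Nesting {
  // Measures how deeply a run of brackets such as "(())" is nested.
  // `depth` mentions itself on the right-hand side; declaring it as a
  // lazy val ties the knot, so the self-reference resolves to this same
  // parser once it has been initialised.
  lazy val depth: Parsley[Int] =
    (char('(') *> depth.map(_ + 1) <* char(')')) <|> pure(0)
}
```

Because `Nesting.depth` is a value held in an object rather than a `def` that rebuilds the parser on each call, the same parser is reused across calls to `parse`, in line with the advice to compile once and run many times. Under these assumptions, `Nesting.depth.parse("(())")` should succeed with `2`, and the unbalanced input `"(()"` should fail.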
References
- This work is based on my Master's Thesis (2018) which can be found here
- This work spawned a paper at the Scala Symposium at ICFP 2018: Garnishing Parsec with Parsley
- This work supports the patterns introduced at the Scala Symposium in 2022: Design Patterns for Parser Combinators in Scala