pyramid
A library for storing and querying graph data in Clojure.
Features:
- Graph query engine that works on any in-memory data store
- Algorithm for taking trees of data and storing them in normal form in Clojure data.
Install & docs
Why
Clojure is well known for its graph databases like datomic and datascript which implement a datalog-like query language. There are contexts where the power vs performance tradeoffs of these query languages don't make sense, which is where pyramid can shine.
Pyramid focuses on doing essential things like selecting data from and traversing relationships between entities, while eschewing arbitrary logic like what SQL and datalog provide. What it lacks in features it makes up for in read performance when combined with a data store that has fast in-memory look ups of entities by key, such as Clojure maps. It can also be extended to databases like Datomic, DataScript and Asami.
What
Pyramid can be useful at each evolutionary stage of a program where one needs to traverse and make selections out of graphs of data.
Selection
Pyramid starts by working with Clojure data structures. A simple program that
uses pyramid can use a query to select specific data out of a large, deeply
nested tree, like a supercharged select-keys
.
(def data
{:people [{:given-name "Bob" :surname "Smith" :age 29}
{:given-name "Alice" :surname "Meyer" :age 43}]
:items {}})
(def query [{:people [:given-name]}])
(pyramid.core/pull data query)
;; => {:people [{:given-name "Bob"} {:given-name "Alice"}]}
Transformation
Pyramid combines querying with the Visitor pattern
in a powerful way, allowing one to easily perform transformations of selections
of data. Simply annotate parts of your query with metadata
{:visitor (fn visit [data selection] ,,,)}
and the visit
function will be
used to transform the data in a depth-first, post-order traversal (just like
clojure.walk/postwalk
).
(def data
{:people [{:given-name "Bob" :surname "Smith" :age 29}
{:given-name "Alice" :surname "Meyer" :age 43}]
:items {}})
(defn fullname
[{:keys [given-name surname] :as person}]
(str given-name " " surname))
(def query [{:people ^{:visitor fullname} [:given-name :surname]}])
(pyramid.core/pull data query)
;; => {:people ["Bob Smith" "Alice Meyer"]}
Accretion
A more complex program may need to keep track of that data over time, or query
data that contains cycles, which can be done by creating a pyramid.core/db
.
"Pyramid dbs" are maps that have a particular structure:
;; for any entity identified by `[key id]`, it follows the shape:
{key {id {,,,}}
Adding data to a db will normalize the data into a flat structure allowing for easy updating of entities as new data is obtained and allow relationships that are hard to represent in trees. Queries can traverse the references inside this data.
See docs/GUIDE.md.
Durability
A program may grow to need durable storage and other features that more full
featured in-memory databases provide. Pyramid provides a protocol, IPullable
,
which can be extended to allow queries to run over any store that data can be
looked up by a tuple, [primary-key value]
. This is generalizable to most
databases like Datomic, DataScript, Asami and SQLite.
Full stack
The above shows the evolution of a single program, but many programs never grow beyond the accretion stage. Pyramid has been used primarily in user interfaces where data is stored in a data structure and queried over time to show different views on a large graph. Through its protocols, it can now be extended to be used with durable storage on the server as well.
Concepts
Query: A query is written using EQL, a query language implemented inside Clojure. It provides the ability to select data in a nested, recursive way, making it ideal for traversing graphs of data. It does not provide arbitrary logic like SQL or Datalog.
Entity map: a Clojure map which contains information that uniquely identifies
the domain entity it is about. E.g. {:person/id 1234 :person/name "Bill" :person/age 67}
could be uniquely identified by it's :person/id
key. By
default, any map which contains a key which (= "id" (name key))
is true, is an
entity map and can be normalized using pyramid.core/db
.
Ident function: a function that takes a map and returns a tuple [:key val]
that uniquely identifies the entity the map describes.
Lookup ref: a 2-element Clojure vector that has a keyword and a value which
together act as a pointer to a domain entity. E.g. [:person/id 1234]
.
pyramid.core/pull
will attempt to look them up in the db value if they appear
in the result at a location where the query expects to do a join.
Usage
See docs/GUIDE.md
Prior art
- Fulcro
- @souenzzo's POC eql-refdb
- MapGraph
- EntityDB
- DataScript and derivatives
- Pathom
- juxt/pull
- ribelo/doxa
Copyright
Copyright Β© 2023 Will Acton. Distributed under the EPL 2.0.