• Stars
    star
    316
  • Rank 132,587 (Top 3 %)
  • Language
    Clojure
  • License
    Eclipse Public Li...
  • Created over 10 years ago
  • Updated almost 5 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Flexible datastore migration and seeding for Clojure projects

Joplin

Joplin 0.3 contains many breaking changes over the 0.2 releases, please see here for details.

Joplin is a library for flexible datastore migration and seeding.

Joplin tries to solve the problems that arise when dealing with complicated systems consisting of multiple datastores. It lets you define and reason about environments (for instance dev, staging, UAT, prod).

Joplin lets you declare your databases, migrators, seed functions up front and combine them into different environments. It can be used from the REPL, via a leiningen aliases or be called programatically.

Joplin comes with plugins for SQL/JDBC databases, Datomic, ElasticSearch, Cassandra, DynamoDB, Hive and Zookeeper. It is built with extensibility in mind, adding more stores is done by a handful of multi-methods.

Joplin is built on top of ragtime.

Libraries

  • joplin.core - database independent tools and functions for managing migrations and seeds
  • joplin.cassandra - migrate and seed Cassandra clusters
  • joplin.dynamodb - migrate and seed DynamoDB clusters
  • joplin.datomic - migrate and seed Datomic databases
  • joplin.elasticsearch - migrate and seed Elasticsearch clusters
  • joplin.hive - migrate and seed Hive tables using Avro
  • joplin.jdbc - migrate and seed SQL databases with jdbc
  • joplin.zookeeper - seed Zookeeper clusters
  • clj-rethinkdb-migrations - migrate and seed RethinkDB clusters
  • joplin.mongodb - migrate and seed MongoDB

Installation

Add joplin.core as a dependency if you just want the database-independent core:

:dependencies [[joplin.core "0.3.11"]]

If you are not using Leiningen, add a dependency for the plugins of the databases you want to migrate;

:dependencies [[joplin.cassandra "0.3.11"]
               [joplin.dynamodb "0.3.11"]
               [joplin.datomic "0.3.11"]
               [joplin.elasticsearch "0.3.11"]
               [joplin.hive "0.3.11"]
               [joplin.jdbc "0.3.11"]
               [joplin.zookeeper "0.3.11"]]

Using Joplin

Defining the configuration map

The joplin configuration map contains the keys :databases, :migrators, :seeds and :environments. The first three are pure declarations and :environments is where these declarations gets combined. For more details on these 4 keys see the Concepts wiki page.

Typically you would define the configuration in a .edn file that you put somewhere on your classpath.

Please note that the folders with migrators and seed-var namespaces needs to be in the classpath in order for joplin to access them. Joplin will also look into any jar files on the classpath for resource folders with matching name.

Example of a joplin definition;

{
:migrators {:sql-mig "joplin/migrators/sql"}  ;; A path for a folder with migration files

:seeds {:sql-seed "seeds.sql/run"             ;; A clojure var (function) that applies the seed
        :es-seed "seeds.es/run"}

:databases {:sql-dev {:type :jdbc, :url "jdbc:h2:mem:dev"}
            :es-prod {:type :es, :host "es-prod.local", :port "9300", :cluster "dev"}
            :sql-prod {:type :jdbc, :url "jdbc:h2:file:prod"}}

;; We combine the definitions above into different environments
:environments {:dev [{:db :sql-dev, :migrator :sql-mig, :seed :sql-seed}]
               :prod [{:db :sql-prod, :migrator :sql-mig}
                      {:db :es-prod}, :seed :es-seed]}
}

Hiding secrets from the configuration map

Once the config map has been written to a .edn file you'll need to read that file and pass the datastructure to joplin's migration and seed functions. JDBC connection strings typically include username and password that you don't want to put into source control.

The joplin.repl namespace provides a function called load-config that take the name (or URL) of a file on the classpath (joplin.edn) and reads the contents turning it into Clojure datastructures. This function implements 2 tagged literals to solve the problem with secrets in the configuration; env and envf

:es-dev      {:type :es, :host #env TARGET_HOST, :port 9200}
:psql-prod   {:type :sql, :url #envf ["jdbc:postgresql://psq-prod/prod?user=%s&password=%s"
                                      PROD_USER PROD_PASSWD]}
  • env takes a single name of an environment variable that is put in its place in the result of the load-config call.
  • envf takes a vector of data, the first item is a string passed to Clojure's format function. The rest of the items in the vector are names of environment variables that will be looked up and passed along with the format string to the format call.

Overriding the migration table

For jdbc / sql migrators the migration table is called ragtime_migrations by default. You can change this by adding a key called :migrations-table to the database map.

Running migrations

The joplin.repl namespace defines a function for each of Joplin's 5 basic operations. It's common to use these functions to setup a 'migration and seed REPL' from which you can control your databases.

  • migrate [conf env & args]

Run all pending (up) migrations on either all databases in the environment or a single if the database is provided as the first argument after env. This operation is idempotent.

  • seed [conf env & args]

Run the seed functions on either all databases in the environment or a single if the database is provided as the first argument after env. This operation is NOT idempotent, running seed functions multiple time can lead to duplication of data.

Joplin will only run seed functions on databases that have been fully migrated, i.e. have no pending migrations.

  • rollback [conf env database amount-or-id & args]

Rollback N migrations (down migrate) from a single database in an environment. N can either be a number (the number to migration steps you want to roll back, or an id of a migration you want to roll back to). This command requires that you specify a db.

  • reset [conf env database & args]

Rollback ALL migrations, apply ALL up migrations and call the seed function for a single database in an environment. Intended to reset a database to a good state. This command require that you specify a db.

Note that reset will only work properly if all down migrations do a complete cleanup of the action taken by the up migration.

  • pending [conf env database & args]

Print pending migrations for a database in an environment.

  • create [conf env database & args]

Create a new migration scaffold file for a single database in an environment. This command requires that you specify both a database and identification string of the migrations.

Running migrations from the command line

When you want to script your migrations requiring a REPL is not convenient. In this case you can use Leiningen aliases to provide command-line commands for the 5 operation above. The example project shows to do do this in the project.clj and the code the aliases uses.

Migration conflict strategies

The extra args of the joplin.repl functions are used to pass extra options to ragtime, this is currently used for the migrate operation where different strategies for dealing with migration conflicts can be specified.

(migrate config :dev :cass-dev {:strategy ragtime.strategy/rebase})

Read here for info on available strategies.

Writing migrators

Joplin migrators defaults to be 'code driven'. They are basic Clojure files and allows the migrators to be as simple or complicated as needed. If you are working with a SQL database, there is a second flavor of migrators at your disposal; ragtime-style sql migrators.

Joplin default migrators

A migrator consist of a single clojure source file. This file must contain (at least) 2 function definitions, one called up and one called down. The migrators will be called with one argument, which is the Ragtime DataStore record created by the joplin plugin. These record will contain the information you need to make a connections to the database and inflict a change.

Example of migrator;

$ ls -1 migrators/cass
20140717174605_users.clj
$ cat migrators/cass/20140717174605_users.clj
(ns migrators.cass.20140717174605-users
  (:use [joplin.cassandra.database])
  (:require [clojurewerkz.cassaforte.cql    :as cql]
            [clojurewerkz.cassaforte.query  :as cq]))

(defn up [db]
  (let [conn (get-connection (:hosts db) (:keyspace db))]
    (cql/create-table conn "users"
                      (cq/column-definitions {:id :varchar
                                              :email :varchar
                                              :primary-key [:id]}))))

(defn down [db]
  (let [conn (get-connection (:hosts db) (:keyspace db))]
    (cql/drop-table conn "users")))

Read the project.clj file for the corresponding joplin plugin to see what clojure database libraries can be used in the migrators.

SQL migrators

When migrating SQL databases you have 2 flavors of migrators at your disposal. You may specify your migrations with two text files (one up, one down) as shown below:

$ ls -1 migrators/sql
20120903221323-add-test-table.down.sql
20120903221323-add-test-table.up.sql
$ cat migrators/sql/20120903221323-add-test-table.up.sql
CREATE TABLE test_table (id INT);

For this type of migration, use the Joplin database type :sql.

You may also specify your migrations as Clojure namespaces, like the example for Cassandra above, by using the Joplin database type :jdbc. See the example project for details.

Please note as of Joplin 0.3 if you have multiple SQL statements in a single .sql file, you need to insert a marker --;; between them.

Writing seed functions

Seed functions are always placed in a clojure source file. The source file can contain any number of functions, since a seed definition is the name of a clojure var referencing a function.

The seed function are called with the target map (see below) as its first parameter and any number of additional parameter that were passed into the lein plugin invocation (or to the multi-method if called programatically).

Example of a seed function;

(ns seeds.dt
  (:require [datomic.api :as d]))

(defn run [target & args]
  (let [conn (d/connect (-> target :db :url))]
    @(d/transact conn [{:db/id #db/id [:db.part/user -100]
                        :foobar/id 42}])))

The target map will consist of 3 keys; :db, :migrator, :seed. The contents of these keys will be the (looked up) values of the keys specified in the environment map in the joplin definition.

Calling Joplin from your code

Calling joplin from your code requires that you have loaded the joplin.core namespace and also the namespaces of the database plugins you intend to use. Then you can call one of 5 multi-methods to perform the same 5 actions as described above in the leiningen plugin section. The multi-methods are; joplin.core/migrate-db, joplin.core/rollback-db, joplin.core/seed-db, joplin.core/pending-migrations, joplin.core/create-migration.

All the multi-methods takes a target map as its first argument and additional optional arguments.

The target map must have this shape;

{:db {:type DB-TYPE, DB-SETTING1: "foo", :DB-SETTING2: "bar", ...}
 :migrator "path to a folder on the classpath"
 :seed "name of a var in a namespace on the classpath"
 }

The :migrator and/or :seed keys are optional (joplin can't do much if both are missing). The :DB-SETTING keys vary for the different database types.

For example;

(ns example
  (:require [joplin.core :as joplin]
            [joplin.dt.database]))

(joplin/migrate-db
  {:db {:type :dt,
        :url "datomic:mem://test"}
   :migrator "joplin/migrators/datomic"
   :seed "seeds.dt/run"})

Provided database types and their respective plugins;

{:db {:type ? ... :require [?]
:jdbc joplin.jdbc.database
:sql joplin.jdbc.database
:es joplin.elasticsearch.database
:zk joplin.zookeeper.database
:dt joplin.dt.database
:cass joplin.cassandra.database
:dynamo joplin.dynamodb.database

Note that it's easy to extend joplin to handle more database types and thus introduce more valid database types.

Hacking joplin

Joplin uses the lein-sub plugin, this makes it easy to install all projects locally with lein sub install.

Documentation

Wiki

License

Copyright © 2015 Martin Trojer

Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.

More Repositories

1

bidi

Bidirectional URI routing
Clojure
991
star
2

aero

A small library for explicit, intentful configuration.
Clojure
743
star
3

yada

A powerful Clojure web library, full HTTP, full async - see https://juxt.pro/yada/index.html
HTML
735
star
4

tick

Time as a value.
Clojure
599
star
5

edge

A Clojure application foundation from JUXT
Clojure
503
star
6

juxt-accounting

Double-entry accounting software written in Clojure with Datomic.
Clojure
281
star
7

pack.alpha

Package clojure projects
Java
258
star
8

mach

A remake of make (in ClojureScript)
Clojure
246
star
9

jig

Jig is an application harness providing a beautifully interactive development experience for Clojure projects.
Clojure
229
star
10

clip

Light structure and support for dependency injection
Clojure
228
star
11

iota

Infix operators for test assertions
Clojure
150
star
12

site

A web and API server, powered by xtdb.com
Clojure
135
star
13

modular

JavaScript
128
star
14

apex

A compendium of Clojure libraries for implementing web backends.
Clojure
123
star
15

bolt

An integrated security system for applications built on component
Clojure
123
star
16

jinx

jinx is not xml-schema (it's json-schema!)
Clojure
99
star
17

pull

Trees from tables
Clojure
96
star
18

roll

AWS Blue/Green deployment using Clojure flavoured devops
Clojure
77
star
19

reap

A Clojure library for decoding and encoding strings used by web protocols.
Clojure
74
star
20

dirwatch

A Clojure directory watcher, wrapping the JDK 7 java.nio.file.WatchService.
Clojure
67
star
21

skip

Skippy McSkipface - A general Clojure dependency tracker
Clojure
34
star
22

snap

Snapshot testing for Clojure and Clojurescript
Clojure
30
star
23

pack-datomic

Datomic Packer and Terraform setup
HCL
26
star
24

qcon2014

JavaScript
26
star
25

stoic

Marrying the Component Lifecycle pattern with distributed config.
Clojure
20
star
26

spin

Unbundling the web application-tier
Clojure
19
star
27

pick

A Clojure library for HTTP server-driven content negotiation.
Clojure
18
star
28

prop

Clojure
17
star
29

grab

A Clojure library for parsing and executing GraphQL queries.
Clojure
16
star
30

rest

A guide to coding a REST service
CSS
15
star
31

chatserver

Clojure
13
star
32

shop

The JUXT Shop - a sample application built on Crux
Clojure
11
star
33

fuzz

Example ClojureScript Project that happens to wrap Slack
Clojure
9
star
34

dotfiles

Recommended dotfiles used by JUXT
Emacs Lisp
9
star
35

ramp

Shell
8
star
36

card

Not sure what this is. A kind-of Zettelkasten?
TypeScript
8
star
37

chatclient

Clojure
8
star
38

hire

A technical interview test for potential employees
Clojure
7
star
39

rock

Hardened AMIs for Clojure deployments (Arch Linux)
Shell
7
star
40

clip-example

Example of using Clip
Clojure
7
star
41

datomic-extras

Clojure
7
star
42

lein-dockerstalk

Clojure
7
star
43

docker

Docker files
Makefile
7
star
44

plain-clojure-lambda

A recipe for building AWS Lambdas using plain clojure
Clojure
7
star
45

draw

A DSL for SVG diagrams
Clojure
6
star
46

trag

Graph transduction
Clojure
5
star
47

radar

JUXT Clojure Technology Radar
Clojure
5
star
48

pack-ek

HCL
4
star
49

adoc

A Clojure wrapper for asciidoctorj
Clojure
4
star
50

msf

Volunteer work for MSF
Clojure
4
star
51

astro-website

Source for the main JUXT website
JavaScript
3
star
52

vext

Clojure
3
star
53

azondi

MQTT goes reactive
CSS
3
star
54

pack-riemann

HCL
3
star
55

yada.cookbook

A cookbook of yada recipes
3
star
56

component-utils

Clojure
2
star
57

blip

Small library for fetching/injecting graphql definitions
Clojure
2
star
58

ping

Clojure
2
star
59

modular.co-dependency

Co-dependency support for com.stuartsierra.component
Clojure
2
star
60

http

2
star
61

lein-deploy-tar

A Leiningen plugin to upload tars to a Maven repository.
Clojure
2
star
62

aleph-issue

Testing aleph
Clojure
1
star
63

mksmarthack

A Clojure data hack on MK:Smart
Clojure
1
star
64

eventing-examples

Clojure
1
star
65

flow

Clojure
1
star
66

training-exercises-webapp

Training exercises
Clojure
1
star
67

example-code

Example code for a recent client course
Clojure
1
star
68

home-apps

Monorepo for SPAs that use home.juxt.site as a backend
TypeScript
1
star
69

sail

Clojure
1
star
70

clojars-mirrors

Re-releasing Clojars libraries onto Maven Central
Clojure
1
star
71

XT20.conf

Public materials for XT20
1
star
72

asciidoctor-stylesheet-factory

CSS
1
star