Making Clojure Even Sweeter
Full API docs are available on cljdoc.org. Older codox-style API docs are on GitHub Pages.
Tupelo Overview
Have you ever wanted to do something simple but clojure.core doesn’t support it? Or, maybe you are wishing for an enhanced version of a standard function. If so, then Tupelo is for you! Tupelo is a library of helper and convenience functions to make working with Clojure simpler, easier, and more bulletproof.
Tupelo Organization
The functionality of the Tupelo library is divided into a number of namespaces, each with a single area of focus. These are:
Tupelo Core - A library of helper functions for core Clojure.
Please see the tupelo.core docs further below.
Tupelo-Forest - A library for searching & manipulating tree-like data structures
Please see the tupelo.forest docs for further information.
Tupelo-Datomic - A library of helper functions for Datomic.
The tupelo-datomic library has been split out into an independent project. Please see the tupelo-datomic project for details.
Tupelo CSV - Functions for using CSV (Comma Separated Value) files
The standard clojure-csv library has well-tested and useful functions for parsing CSV (Comma Separated Value) text data, but it does not offer all of the convenience one may wish. Tupelo CSV emphasizes the idiomatic Clojure usage of data, using sequences and maps. Please see the tupelo.csv docs.
Tupelo Parse - Functions to ease text parsing
Please see the tupelo.parse docs.
Tupelo String - Functions to ease string operations
Please see the tupelo.string docs.
Tupelo Schema - Type Definitions
Enables type checking in Clojure code via Plumatic Schema. Please see the source code for definitions, and the James Bond example code for examples of the type-checking in action.
Tupelo Types - A collection of functions for testing object types
Please see the tupelo.types docs.
Tupelo Misc - A grab bag of functions that don’t fit anywhere else (yet!)
Please see the tupelo.misc docs.
tupelo.base64 - Convert to/from base64 encoding.
Please see the tupelo.base64 docs.
tupelo.base64url - Convert to/from base64url encoding.
Please see the tupelo.base64url docs.
Tupelo Y64 - Convert to/from the URL-safe Y64 encoding (Yahoo YUI library).
Please see the tupelo.y64 docs.
Tupelo Core Overview
Have you ever wanted to do something simple but clojure.core doesn’t support it? Or, maybe you are wishing for an enhanced version of a standard function. The goal of tupelo.core is to add support for these convenience features, so that you have a simple way of using either the enhanced version or the original version.
The goal in using tupelo.core is that you can just plop it into any namespace without having to worry about any conflicts with clojure.core functionality. So, both the core functions and the added/enhanced functions are available for use at all times. As such, we normally refer tupelo.core into our namespace as follows:
(ns my.proj
(:use tupelo.core)
(:require
[clojure.string :as str]
... ))
Expression Debugging
Have you ever been debugging some code and had trouble printing out intermediate values? For example:
(-> 1
(inc) ; want to print value in pipeline after "(inc)" expression
(* 2))
4
Suppose you want to display the value after the (inc) function. You can’t just insert a (println …) because the return value of nil will break the pipeline structure. Instead, just use spy:
(-> 1
(inc)
(spy) ; print value at this location in pipeline
(* 2))
; spy => 2 ; output from spy
4 ; return value from the threading pipeline
This tool is named spy since it can display values from inside any threading form without affecting the result of the expression. In this case, spy printed the value 2 resulting from the (inc) expression. Then, the value 2 continued to flow through the following expressions in the pipeline so that the return value of the expression is unchanged.
You can add in a keyword message to label each spy output:
(-> 1
(inc)
(spy :after-inc) ; add a custom keyword message
(* 2))
; :after-inc => 2 ; spy output is labeled with keyword message
4 ; return value is unchanged
Note that spy works equally well inside either a "thread-first" or a "thread-last" form (e.g. using -> or ->>), without requiring any changes.
(->> 1
(inc)
(spy :after-inc) ; spy works equally with both -> and ->> forms
(* 2))
; :after-inc => 2
4
How does spy accomplish this trick? The answer is that the keyword message is assumed to be the label, since interesting debug values are more likely to be strings, numbers, or collections like vectors & maps (if both args are keywords, an exception is thrown; use some other technique for debugging this use-case). Thus, spy can detect whether it is in a thread-first or thread-last form, and then label the output correctly. A side benefit is that keywords like :after-inc or just :110 are easy to grep for in output log files.
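The keyword-detection idea can be sketched with nothing but clojure.core. The name spy-sketch and the exact output format are assumptions made for illustration; this is not Tupelo's implementation.

```clojure
;; Hypothetical sketch: whichever argument is a keyword is treated as the label.
(defn spy-sketch
  ([value]
   (println "spy =>" (pr-str value))
   value)
  ([arg1 arg2]
   (let [[label value]
         (cond
           (and (keyword? arg1) (keyword? arg2))
           (throw (IllegalArgumentException. "spy-sketch: both args are keywords"))

           (keyword? arg1) [arg1 arg2]   ; thread-last:  (spy-sketch :label value)
           (keyword? arg2) [arg2 arg1]   ; thread-first: (spy-sketch value :label)

           :else
           (throw (IllegalArgumentException. "spy-sketch: expected one keyword label")))]
     (println label "=>" (pr-str value))
     value)))   ; return the value unchanged so the pipeline continues

(-> 1 (inc) (spy-sketch :after-inc) (* 2))    ; prints ":after-inc => 2"
(->> 1 (inc) (spy-sketch :after-inc) (* 2))   ; prints ":after-inc => 2"
```

Because the label is found by type rather than by position, the same call works in both threading forms.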
As a bonus for debugging, the value is output using (pr-str …) so that numbers and strings are unambiguous in the output:
(-> 30
(+ 4)
(spy :dbg)
(* 10))
; :dbg => 34 ; integer result = 34
340
(-> "3"
(str "4")
(spy :dbg)
(str "0"))
; :dbg => "34" ; string result = "34"
"340"
Sometimes you may prefer to print out the literal expression instead of a keyword label. In this case, just use spyx (short for "spy expression"):
(it-> 1 ; tupelo.core/it->
(spyx (inc it))
(* 2 it))
; (inc it) => 2 ; the expression is used as the label
4
In other instances, you may wish to use spyxx to display the expression, its type, and its value:
(defn mystery-fn [] (into (sorted-map) {:b 2 :a 1}))
(spyxx (mystery-fn))
; (mystery-fn) => <#clojure.lang.PersistentTreeMap {:a 1, :b 2}>
Non-pure functions (i.e. those with side effects) are safe to use with spy. Any expression supplied to spy will be evaluated only once.
Sometimes you may just want to save some repetition for a simple printout:
(def answer 42)
(spyx answer)
; answer => 42
To be precise, the function signatures for the spy family are:
(spy <expr>) ; print value of <expr> w/o custom message string
(spy <expr> :kw-label) ; works with ->
(spy :kw-label <expr>) ; works with ->>
(spyx <expr>) ; prints <expr> and its value
(spyxx <expr>) ; prints <expr>, its type, and its value
If you are debugging a series of nested function calls, it can often be handy to indent the spy output to help in visualizing the call sequence. Using with-spy-indent will give you just what you want:
(doseq [x [:a :b]]
(spyx x)
(with-spy-indent
(doseq [y (range 3)]
(spyx y))))
x => :a
  y => 0
  y => 1
  y => 2
x => :b
  y => 0
  y => 1
  y => 2
Literate Threading Macro
We all love to use the threading macros -> and ->> for certain tasks, but they only work if all of the forms should be threaded into the first or last argument.
The built-in threading macro as-> can avoid this problem, but the order of the first expression and the placeholder symbol is arguably backwards from what most users would expect. Also, there is often no obvious name to use for the placeholder symbol. Re-using a good idea from Groovy (also copied by Kotlin), we simply use the symbol it as the placeholder symbol in each expression to represent the value of the previous result.
(it-> 1
(inc it) ; thread-first or thread-last
(+ it 3) ; thread-first
(/ 10 it) ; thread-last
(str "We need to order " it " items." )) ; middle of 3 arguments
;=> "We need to order 2 items."
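One way to picture how such a macro can work is as a let form that rebinds the symbol it at every step. The macro name it-sketch-> is hypothetical and the code is only a minimal sketch; it is not taken from the Tupelo source.

```clojure
;; Hypothetical sketch: each form becomes a new binding of the symbol `it`.
(defmacro it-sketch->
  [expr & forms]
  `(let [~'it ~expr
         ~@(interleave (repeat 'it) forms)]   ; it form1 it form2 ...
     ~'it))

(it-sketch-> 1
  (inc it)
  (+ it 3)
  (/ 10 it)
  (str "We need to order " it " items."))
;=> "We need to order 2 items."
```

The call expands to (let [it 1, it (inc it), it (+ it 3), ...] it), which is why the placeholder can appear in any argument position.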
Here is a more complicated example. Note that we can assign into a local let block from the it placeholder value:
(it-> 3
(spy :initial it)
(let [x it]
(inc x))
(spy it :222)
(* it 2)
(spyx it))
; :initial => 3
; :222 => 4
; it => 8
8 ; return value
More examples can be found here.
The it-> macro has a cousin cond-it-> that allows you to thread the updated value through both the conditional and the action expressions:
(let [params {:a 1 :b 1 :c nil :d nil}]
(cond-it-> params
(:a it) (update it :b inc)
(= (:b it) 2) (assoc it :c "here")
(:c it) (assoc it :d "again")))
;=> {:a 1, :b 2, :c "here", :d "again"}
Map Value Lookup
Maps are convenient, especially when keywords are used as functions to look up a value in a map. Unfortunately, attempting to look up a non-existent keyword in a map will return nil. While sometimes convenient, this means that a simple typo in the keyword name will silently return corrupted data (i.e. nil) instead of the desired value.
Instead, use the function grab for keyword/map lookup:
(grab k m)
"A fail-fast version of keyword/map lookup. When invoked as (grab :the-key the-map),
returns the value associated with :the-key as for (clojure.core/get the-map :the-key).
Throws an Exception if :the-key is not present in the-map."
(def sidekicks {:batman "robin" :clark "lois"})
(grab :batman sidekicks)
;=> "robin"
(grab :spiderman sidekicks)
;=> IllegalArgumentException Key not present in map:
;=> map : {:batman "robin", :clark "lois"}
;=> keys: [:spiderman]
The function grab should also be used in place of clojure.core/get. Simply reverse the order of arguments to match the "keyword-first, map-second" convention.
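The silent-nil problem and the fail-fast alternative can be illustrated with clojure.core alone. The function grab-sketch and its error message are assumptions for illustration, not the library's code.

```clojure
(def sidekicks {:batman "robin" :clark "lois"})

;; A typo in the keyword silently yields nil with plain lookup:
(:batmn sidekicks)          ;=> nil
(get sidekicks :batmn)      ;=> nil

;; Hypothetical fail-fast lookup, keyword first and map second:
(defn grab-sketch
  [k m]
  (if (contains? m k)
    (get m k)
    (throw (IllegalArgumentException.
             (str "Key not present in map: " (pr-str k))))))

(grab-sketch :batman sidekicks)   ;=> "robin"
```

Using contains? rather than testing the result for nil means a key that is present with a nil value is still returned normally.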
For looking up values in nested maps, the function fetch-in replaces clojure.core/get-in:
(fetch-in m ks)
"A fail-fast version of clojure.core/get-in. When invoked as (fetch-in the-map keys-vec),
returns the value associated with keys-vec as for (clojure.core/get-in the-map keys-vec).
Throws an Exception if the path keys-vec is not present in the-map."
(def my-map {:a 1 :b {:c 3}})
(fetch-in my-map [:b :c])
3
(fetch-in my-map [:b :z])
;=> IllegalArgumentException Key seq not present in map:
;=> map : {:b {:c 3}, :a 1}
;=> keys: [:b :z]
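A fail-fast nested lookup can be approximated by walking the key path one level at a time. The name fetch-in-sketch and the message text are hypothetical; this is a sketch under those assumptions rather than the library's implementation.

```clojure
;; Hypothetical sketch: throw as soon as any key along the path is missing.
(defn fetch-in-sketch
  [m ks]
  (reduce (fn [curr k]
            (if (and (associative? curr) (contains? curr k))
              (get curr k)
              (throw (IllegalArgumentException.
                       (str "Key seq not present in map: " (pr-str ks))))))
          m
          ks))

(fetch-in-sketch {:a 1 :b {:c 3}} [:b :c])   ;=> 3
(get-in {:a 1 :b {:c 3}} [:b :z])            ;=> nil (clojure.core stays silent)
```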
Map Dissociation
Clojure has functions assoc & assoc-in, update & update-in, and dissoc. However, there is no function dissoc-in. The Tupelo function dissoc-in provides the desired functionality:
(dissoc-in the-map keys-vec)
"A sane version of dissoc-in that will not delete intermediate keys.
When invoked as (dissoc-in the-map [:k1 :k2 :k3... :kZ]), acts like
(clojure.core/update-in the-map [:k1 :k2 :k3...] dissoc :kZ). That is, only
the map entry containing the last key :kZ is removed, and all map entries
higher than kZ in the hierarchy are unaffected."
The unit test shows the functions in action:
(let [my-map {:a { :b { :c "c" }}} ]
(is (= (dissoc-in my-map [] ) my-map ))
(is (= (dissoc-in my-map [:a ] ) {} ))
(is (= (dissoc-in my-map [:a :b ] ) {:a {}} ))
(is (= (dissoc-in my-map [:a :b :c] ) {:a { :b {}}} ))
(is (= (dissoc-in my-map [:a :x :y] ) {:a { :b { :c "c" }
:x nil }} )))
Note that if nonexistent keys are included in keys-vec, any missing map layers will be constructed as necessary, which is consistent with the behavior of both clojure.core/assoc-in and clojure.core/update-in (note that nil is the value of the final map entry, not the empty map {} as for the other examples).
Note that only the map entry corresponding to the last key kZ is cleared. This differs from the dissoc-in function in the old clojure-contrib library which had the unpredictable behavior of recursively (& silently) deleting all keys in keys-vec corresponding to empty maps.
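The update-in equivalence quoted in the docstring can be written out directly. The function name dissoc-in-sketch is hypothetical, and the special handling of empty and single-key paths is an assumption chosen to match the unit test results shown earlier.

```clojure
;; Hypothetical sketch built on clojure.core/update-in and dissoc.
(defn dissoc-in-sketch
  [m ks]
  (let [parent-path (vec (butlast ks))
        last-key    (last ks)]
    (cond
      (empty? ks)          m                      ; nothing to remove
      (empty? parent-path) (dissoc m last-key)    ; single key: plain dissoc
      :else                (update-in m parent-path dissoc last-key))))

(dissoc-in-sketch {:a {:b {:c "c"}}} [:a :b :c])   ;=> {:a {:b {}}}
(dissoc-in-sketch {:a {:b {:c "c"}}} [:a :x :y])   ;=> {:a {:b {:c "c"}, :x nil}}
```

The nil under :x appears because update-in creates the missing entry and (dissoc nil :y) returns nil.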
Gluing Together Like Collections
The concat function can sometimes have rather surprising results:
(concat {:a 1} {:b 2} {:c 3} )
;=> ( [:a 1] [:b 2] [:c 3] )
In this example, the user probably meant to merge the 3 maps into one. Instead, the three maps were mysteriously converted into length-2 vectors, which were then nested inside another sequence.
The conj function can also surprise the user:
(conj [1 2] [3 4] )
;=> [1 2 [3 4] ]
Here the user probably wanted to get [1 2 3 4] back, but instead got a nested vector by mistake.
Instead of having to wonder if the items to be combined will be merged, nested, or converted into another data type, we provide the glue function to always combine like collections together into a result collection of the same type:
; Glue together like collections:
(is (= (glue [ 1 2] '(3 4) [ 5 6] ) [ 1 2 3 4 5 6 ] )) ; all sequential (vectors & lists)
(is (= (glue {:a 1} {:b 2} {:c 3} ) {:a 1 :c 3 :b 2} )) ; all maps
(is (= (glue #{1 2} #{3 4} #{6 5} ) #{ 1 2 6 5 3 4 } )) ; all sets
(is (= (glue "I" " like " \a " nap!" ) "I like a nap!" )) ; all text (strings & chars)
; If you want to convert to a sorted set or map, just put an empty one first:
(is (= (glue (sorted-map) {:a 1} {:b 2} {:c 3}) {:a 1 :b 2 :c 3} ))
(is (= (glue (sorted-set) #{1 2} #{3 4} #{6 5}) #{ 1 2 3 4 5 6 } ))
An Exception will be thrown if the collections to be 'glued' are not all of the same type. The allowable input types are:
- all sequential: any mix of lists & vectors (vector result)
- all maps (sorted or not)
- all sets (sorted or not)
- all text: any mix of strings & characters (string result)
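The four cases in the list above can be approximated with a type dispatch over clojure.core functions. The name glue-sketch is hypothetical and the code is a simplified sketch, not Tupelo's implementation.

```clojure
;; Hypothetical sketch: combine like collections, or throw on a mixed bag.
(defn glue-sketch
  [& colls]
  (cond
    (every? sequential? colls) (vec (apply concat colls))    ; vector result
    (every? map? colls)        (reduce into colls)           ; first map decides sortedness
    (every? set? colls)        (reduce into colls)           ; first set decides sortedness
    (every? #(or (string? %) (char? %)) colls) (apply str colls)
    :else (throw (IllegalArgumentException.
                   "glue-sketch: collections are not all of the same type"))))

(glue-sketch [1 2] '(3 4) [5 6])          ;=> [1 2 3 4 5 6]
(glue-sketch {:a 1} {:b 2} {:c 3})        ;=> a map with :a 1, :b 2, :c 3
(glue-sketch "I" " like " \a " nap!")     ;=> "I like a nap!"
```

Because reduce into accumulates into the first argument, putting an empty sorted map or sorted set first yields a sorted result.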
Adding Values to the Beginning or End of a Sequence
Clojure has the cons, conj, and concat functions, but it is not obvious how they should be used to add a new value to the beginning or end of a vector or list:
; Add to the end
> (concat [1 2] 3) ;=> IllegalArgumentException
> (cons [1 2] 3) ;=> IllegalArgumentException
> (conj [1 2] 3) ;=> [1 2 3]
> (conj [1 2] 3 4) ;=> [1 2 3 4]
> (conj '(1 2) 3) ;=> (3 1 2) ; oops
> (conj '(1 2) 3 4) ;=> (4 3 1 2) ; oops
; Add to the beginning
> (conj 1 [2 3] ) ;=> ClassCastException
> (concat 1 [2 3] ) ;=> IllegalArgumentException
> (cons 1 [2 3] ) ;=> (1 2 3)
> (cons 1 2 [3 4] ) ;=> ArityException
> (cons 1 '(2 3) ) ;=> (1 2 3)
> (cons 1 2 '(3 4) ) ;=> ArityException
Do you know what conj does when you pass it nil instead of a sequence? It silently replaces it with an empty list: (conj nil 5) ⇒ (5). This can cause you to accumulate items in reverse order if you aren’t aware of the default behavior:
(-> nil
(conj 1)
(conj 2)
(conj 3))
;=> (3 2 1)
These failures are irritating and unproductive, and the error messages don’t make it obvious what went wrong. Instead, use the simple prepend and append functions to add new elements to the beginning or end of a sequence, respectively:
(append [1 2] 3 ) ;=> [1 2 3 ]
(append [1 2] 3 4) ;=> [1 2 3 4]
(prepend 3 [2 1]) ;=> [ 3 2 1]
(prepend 4 3 [2 1]) ;=> [4 3 2 1]
Both prepend and append always return a vector result.
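Both behaviors can be imitated with into and vec. The names append-sketch and prepend-sketch are hypothetical, and this sketch skips the argument validation a real library function would likely perform.

```clojure
;; Hypothetical sketches that always return a vector.
(defn append-sketch
  [coll & elems]
  (into (vec coll) elems))            ; new elements go after the collection

(defn prepend-sketch
  [& args]
  (let [elems (butlast args)          ; all args but the last are new elements
        coll  (last args)]            ; the last arg is the collection
    (into (vec elems) coll)))

(append-sketch [1 2] 3 4)     ;=> [1 2 3 4]
(append-sketch '(1 2) 3)      ;=> [1 2 3]   (no list-style reversal)
(prepend-sketch 4 3 [2 1])    ;=> [4 3 2 1]
```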
Combining Scalars and Vectors
Suppose we have a mixture of scalars & vectors (or lists) that we want to combine into a single vector. We want a function ??? to give us the following result:
(??? 1 2 3 [4 5 6] 7 8 9) => [1 2 3 4 5 6 7 8 9]
Clojure doesn’t have a function for this. Instead we need to wrap all of the scalars into vectors and then use glue or concat:
; can wrap individually or in groups
(glue [1 2 3] [4 5 6] [7 8 9]) => [1 2 3 4 5 6 7 8 9] ; could also use concat
(glue [1] [2] [3] [4 5 6] [7] [8] [9]) => [1 2 3 4 5 6 7 8 9] ; could also use concat
It may be inconvenient to always wrap the scalar values into vectors just to combine them with an occasional vector value. Instead, it might be more convenient to unwrap the vector values, then combine the result with other scalars. We can do that with the ->vector and unwrap functions:
(->vector 1 2 3 4 5 6 7 8 9) => [1 2 3 4 5 6 7 8 9]
(->vector 1 (unwrap [2 3 4 5 6 7 8]) 9) => [1 2 3 4 5 6 7 8 9]
It will also work recursively for nested unwrap calls:
(->vector 1 (unwrap [2 3 (unwrap [4 5 6]) 7 8]) 9) => [1 2 3 4 5 6 7 8 9]
Removing Values from a Sequence
Suppose you want to remove an element from a sequence. Did you know that Clojure has no equivalent to Java’s List.remove(int index) function? Well, now it does:
(s/defn drop-at :- ts/List
"Removes an element from a collection at the specified index."
[coll :- ts/List
index :- s/Int]
...)
(is (= [ 1 2] (drop-at (range 3) 0)))
(is (= [0 2] (drop-at (range 3) 1)))
(is (= [0 1 ] (drop-at (range 3) 2)))
Unlike the raw take and drop functions on which it is based, drop-at will throw an exception for invalid values of index.
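A bounds-checked removal built from take and drop might look like the following. The name drop-at-sketch and the exception type are assumptions; the real function is defined with Plumatic Schema as shown in the signature.

```clojure
;; Hypothetical sketch: take/drop plus an explicit index check.
(defn drop-at-sketch
  [coll index]
  (let [v (vec coll)]
    (when-not (and (integer? index) (<= 0 index) (< index (count v)))
      (throw (IllegalArgumentException. (str "Invalid index: " index))))
    (vec (concat (take index v)
                 (drop (inc index) v)))))

(drop-at-sketch (range 3) 1)   ;=> [0 2]

;; Plain take/drop accept an out-of-range index without complaint:
(concat (take 9 [0 1 2]) (drop 10 [0 1 2]))   ;=> (0 1 2)
```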
Inserting Values into a Sequence
Suppose you want to insert an element into a sequence. Tupelo has you covered here as well:
(s/defn insert-at :- ts/List
"Inserts an element into a collection at the specified index."
[coll :- ts/List
index :- s/Int
elem :- s/Any]
...)
(is (= [9 0 1] (insert-at [0 1] 0 9)))
(is (= [0 9 1] (insert-at [0 1] 1 9)))
(is (= [0 1 9] (insert-at [0 1] 2 9)))
As with assoc, you are allowed to insert the new element into the first empty slot after all existing elements, but no further. insert-at will throw an exception for invalid values of index.
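The "first empty slot, but no further" rule amounts to allowing any index from 0 through the element count. A minimal sketch, with the hypothetical name insert-at-sketch and an assumed exception type:

```clojure
;; Hypothetical sketch: valid indexes are 0 through (count coll) inclusive.
(defn insert-at-sketch
  [coll index elem]
  (let [v (vec coll)]
    (when-not (and (integer? index) (<= 0 index (count v)))
      (throw (IllegalArgumentException. (str "Invalid index: " index))))
    (vec (concat (take index v) [elem] (drop index v)))))

(insert-at-sketch [0 1] 2 9)   ;=> [0 1 9]  (first empty slot is allowed)
```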
Replacing Values in a Sequence
And, of course, you can also replace an element in a sequence:
(s/defn replace-at :- ts/List
"Replaces an element in a collection at the specified index."
[coll :- ts/List
index :- s/Int
elem :- s/Any]
...)
(is (= [9 1 2] (replace-at (range 3) 0 9)))
(is (= [0 9 2] (replace-at (range 3) 1 9)))
(is (= [0 1 9] (replace-at (range 3) 2 9)))
As with drop-at, replace-at will throw an exception for invalid values of index.
Convenience in Testing Seq’s
Clojure has an empty? function to indicate if a collection has zero elements or is nil (i.e. not present). However, Clojure has no corresponding not-empty? function, and people have written to the mailing list wondering where it is. Well, now it is available:
(not-empty? coll)
"For any collection, returns true if coll contains any items;
otherwise returns false. Equivalent to (not (empty? coll))."
The unit test shows it in action:
(is (= (map not-empty? ["1" [1] '(1) {:1 1} #{1} ] )
[true true true true true ] ))
(is (= (map not-empty? ["" [] '() {} #{} nil ] )
[false false false false false false ] ))
(is (= (keep-if not-empty? ["1" [1] '(1) {:1 1} #{1} ] )
["1" [1] '(1) {:1 1} #{1} ] ))
(is (= (drop-if not-empty? ["" [] '() {} #{} nil] )
["" [] '() {} #{} nil] ))
Just to confuse things, Clojure does have the similarly named functions empty and not-empty. Be sure to avoid these two functions for predicate tests.
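A few REPL-style calls show why the similarly named clojure.core functions are not predicates. The name not-empty?-sketch is hypothetical; it simply follows the docstring's stated equivalence to (not (empty? coll)).

```clojure
;; clojure.core/empty and not-empty return collections or nil, not booleans:
(empty [1 2 3])       ;=> []        an empty collection of the same type
(not-empty [1 2 3])   ;=> [1 2 3]   the collection itself
(not-empty [])        ;=> nil

;; Hypothetical boolean-valued predicate:
(def not-empty?-sketch (complement empty?))

(not-empty?-sketch [1])   ;=> true
(not-empty?-sketch nil)   ;=> false
```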
A similar, but more complicated, situation exists in the case of not-any?. Clojure has the not-any? function to indicate if a predicate is false for all items in a collection. However, there has never been a corresponding any? function such that
(= (not-any? pred coll)
(not (any? pred coll)))
for any predicate and collection. The situation has become more confusing as of Clojure 1.9.0-alpha10, since a completely unrelated function any? has been added in support of clojure.spec. The new any? function is defined as:
(defn any?
"Returns true given any argument."
[x] true)
So the new any? function is a semantic mismatch to the not-any? function and is completely unrelated to testing a collection using a predicate.
The Tupelo library attempts to resolve this confusing situation by providing both positive and negative versions of the collection test with a name which does not conflict with either any? or not-any? in clojure.core:
(has-some? pred coll)
"For any predicate pred & collection coll, returns true if (pred x) is logical true for at least one x in
coll; otherwise returns false. Like clojure.core/some, but returns only true or false."
(has-none? pred coll)
"For any predicate pred & collection coll, returns false if (pred x) is logical true for at least one x in
coll; otherwise returns true. Equivalent to clojure.core/not-any?, and is the inverse of has-some?."
The unit test shows these functions in action:
(is (= true (has-some? odd? [1 2 3] ) ))
(is (= false (has-some? odd? [2 4 6] ) ))
(is (= false (has-some? odd? [] ) ))
(is (= false (has-none? odd? [1 2 3] ) ))
(is (= true (has-none? odd? [2 4 6] ) ))
(is (= true (has-none? odd? [] ) ))
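The docstrings above describe has-some? as clojure.core/some coerced to a boolean, and has-none? as its inverse. A minimal sketch of that description, using hypothetical names:

```clojure
;; clojure.core/some returns the first truthy result or nil, not a strict boolean:
(some odd? [1 2 3])   ;=> true
(some odd? [2 4 6])   ;=> nil

;; Hypothetical sketches that always return true or false:
(defn has-some?-sketch
  [pred coll]
  (boolean (some pred coll)))

(defn has-none?-sketch
  [pred coll]
  (not (has-some?-sketch pred coll)))

(has-some?-sketch odd? [2 4 6])   ;=> false
(has-none?-sketch odd? [2 4 6])   ;=> true
```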
Searching for entries in Collections, Maps, and Sets
Sometimes we want an easy way to find out if an item is in a collection. The Tupelo library supplies three convenient functions for this purpose: contains-elem?, contains-key?, and contains-val?.
The most generic function is contains-elem?, which is intended for vectors or any other Clojure seq:
(testing "vecs"
(let [coll (range 3)]
(isnt (contains-elem? coll -1))
(is (contains-elem? coll 0))
(is (contains-elem? coll 1))
(is (contains-elem? coll 2))
(isnt (contains-elem? coll 3))
(isnt (contains-elem? coll nil)))
(let [coll [ 1 :two "three" \4]]
(isnt (contains-elem? coll :no-way))
(isnt (contains-elem? coll nil))
(is (contains-elem? coll 1))
(is (contains-elem? coll :two))
(is (contains-elem? coll "three"))
(is (contains-elem? coll \4)))
(let [coll [:yes nil 3]]
(isnt (contains-elem? coll :no-way))
(is (contains-elem? coll :yes))
(is (contains-elem? coll nil))))
Here we see that for an integer range or a mixed vector, contains-elem? works as expected for both existing and nonexistent elements in the collection. For maps, we can also search for any key-value pair (expressed as a len-2 vector):
(testing "maps"
(let [coll {1 :two "three" \4}]
(isnt (contains-elem? coll nil ))
(isnt (contains-elem? coll [1 :no-way] ))
(is (contains-elem? coll [1 :two]))
(is (contains-elem? coll ["three" \4])))
(let [coll {1 nil "three" \4}]
(isnt (contains-elem? coll [nil 1] ))
(is (contains-elem? coll [1 nil] )))
(let [coll {nil 2 "three" \4}]
(isnt (contains-elem? coll [1 nil] ))
(is (contains-elem? coll [nil 2] ))))
It is also straightforward to search a set:
(testing "sets"
(let [coll #{1 :two "three" \4}]
(isnt (contains-elem? coll :no-way))
(is (contains-elem? coll 1))
(is (contains-elem? coll :two))
(is (contains-elem? coll "three"))
(is (contains-elem? coll \4)))
(let [coll #{:yes nil}]
(isnt (contains-elem? coll :no-way))
(is (contains-elem? coll :yes))
(is (contains-elem? coll nil))))
For maps & sets, it is simpler (& more efficient) to use contains-key? to find a map entry or a set element:
(deftest t-contains-key?
(is (contains-key? {:a 1 :b 2} :a))
(is (contains-key? {:a 1 :b 2} :b))
(isnt (contains-key? {:a 1 :b 2} :x))
(isnt (contains-key? {:a 1 :b 2} :c))
(isnt (contains-key? {:a 1 :b 2} 1))
(isnt (contains-key? {:a 1 :b 2} 2))
(is (contains-key? {:a 1 nil 2} nil))
(isnt (contains-key? {:a 1 :b nil} nil))
(isnt (contains-key? {:a 1 :b 2} nil))
(is (contains-key? #{:a 1 :b 2} :a))
(is (contains-key? #{:a 1 :b 2} :b))
(is (contains-key? #{:a 1 :b 2} 1))
(is (contains-key? #{:a 1 :b 2} 2))
(isnt (contains-key? #{:a 1 :b 2} :x))
(isnt (contains-key? #{:a 1 :b 2} :c))
(is (contains-key? #{:a 5 nil "hello"} nil))
(isnt (contains-key? #{:a 5 :doh! "hello"} nil))
(throws? (contains-key? [:a 1 :b 2] :a))
(throws? (contains-key? [:a 1 :b 2] 1)))
And, for maps, you can also search for values with contains-val?:
(deftest t-contains-val?
(is (contains-val? {:a 1 :b 2} 1))
(is (contains-val? {:a 1 :b 2} 2))
(isnt (contains-val? {:a 1 :b 2} 0))
(isnt (contains-val? {:a 1 :b 2} 3))
(isnt (contains-val? {:a 1 :b 2} :a))
(isnt (contains-val? {:a 1 :b 2} :b))
(is (contains-val? {:a 1 :b nil} nil))
(isnt (contains-val? {:a 1 nil 2} nil))
(isnt (contains-val? {:a 1 :b 2} nil))
(throws? (contains-val? [:a 1 :b 2] 1))
(throws? (contains-val? #{:a 1 :b 2} 1)))
As seen in the test, each of these functions works correctly when searching for nil values.
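For comparison, clojure.core/contains? tests keys (indexes, for a vector) rather than elements, and the common (some #{x} coll) idiom cannot find nil. The function contains-elem?-sketch is a hypothetical equality scan written for illustration, not the library's code.

```clojure
;; clojure.core/contains? checks vector indexes, not elements:
(contains? [:a :b] 0)    ;=> true
(contains? [:a :b] :a)   ;=> false

;; The set-as-predicate idiom misses nil because the set returns the element itself:
(some #{nil} [:yes nil 3])   ;=> nil

;; Hypothetical sketch: a linear scan using = handles nil and map entries alike.
(defn contains-elem?-sketch
  [coll elem]
  (boolean (some #(= elem %) coll)))

(contains-elem?-sketch [:yes nil 3] nil)          ;=> true
(contains-elem?-sketch {1 :two} [1 :two])         ;=> true
```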
Focus on Vectors
Clojure’s seq abstraction (and lazy seq’s) is very useful, but sometimes you just want everything to stay in a nice, eager, random-access vector. Here is an eager (non-lazy) version of for which always returns results in a vector:
(is= (forv [x (range 4)] (* x x))
[0 1 4 9] )
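An eager, vector-returning for can be pictured as vec wrapped around the lazy clojure.core/for. The macro name forv-sketch is hypothetical and this is only a sketch of the idea.

```clojure
;; Hypothetical sketch: realize the lazy result of `for` into a vector.
(defmacro forv-sketch
  [& body]
  `(vec (for ~@body)))

(forv-sketch [x (range 4)] (* x x))   ;=> [0 1 4 9]
```

Since vec realizes every element immediately, any side effects in the body run at the point of the call rather than when the result is later consumed.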
Simplified Lazy Sequence Generation
Clojure training materials seem to vary somewhat in the recommended form for the generation of a lazy sequence. This is further complicated by the legacy function lazy-cat which can easily cause an out-of-memory error (please see this post).
A simpler form is possible using the tupelo.core/lazy-cons macro. An example of this form in use is:
(defn lazy-countdown [n]
(when (<= 0 n)
(lazy-cons n (lazy-countdown (dec n)))))
(deftest t-all
(is= (lazy-countdown 5) [5 4 3 2 1 0] )
(is= (lazy-countdown 1) [1 0] )
(is= (lazy-countdown 0) [0] )
(is= (lazy-countdown -1) nil ))
The new macro lazy-cons accepts the output value as the first arg, and a recursive function call as the second arg. The recursive call will have delayed-execution and will not be invoked until it is required. The (when <condition>) form returns nil to signal the termination of the lazy sequence.
Implementation note:
The canonical structure of when and lazy-cons shown above is not required, but is probably the simplest of multiple possible choices. The new form of (lazy-cons val (recursive-call…)) is nothing but a simplification of the original clojure.core form (lazy-seq (cons val (recursive-call…))) which reduces typing and the possibility of errors.
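The simplification described in the implementation note can be written as a two-line macro. The names lazy-cons-sketch and lazy-countdown-sketch are hypothetical; the expansion follows the (lazy-seq (cons val (recursive-call…))) form quoted above.

```clojure
;; Hypothetical sketch of the macro as described in the note.
(defmacro lazy-cons-sketch
  [curr-val recursive-call-form]
  `(lazy-seq (cons ~curr-val ~recursive-call-form)))

(defn lazy-countdown-sketch
  [n]
  (when (<= 0 n)                        ; nil terminates the sequence
    (lazy-cons-sketch n (lazy-countdown-sketch (dec n)))))

(lazy-countdown-sketch 5)    ;=> (5 4 3 2 1 0)
(lazy-countdown-sketch -1)   ;=> nil
```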
Please note that tupelo.core/lazy-cons bears no relation to the historical lazy-cons which was briefly considered for clojure.core circa 2008.
Generator Functions for Lazy Sequences (a la Python)
One of the nice features of Python is the ability to use Generator Functions. These allow a function to "yield" a result from anywhere in the code, which is placed in a lazy output buffer for consumption by the calling function. The generator function is "paused" until the output value is consumed, then resumes execution where it left off with all local state preserved. This ability is especially handy when you have nested loops or other structures that make it inconvenient to return a result as the last expression in a function.
(defn concat-gen ; concat a list of collections
[& collections]
(lazy-gen
(doseq [curr-coll collections]
(doseq [item curr-coll]
(yield item)))))
(defn concat-gen-pair
[& collections]
(lazy-gen
(doseq [curr-coll collections]
(doseq [item curr-coll]
(yield-all [item item])))))
(def c1 [1 2 3])
(def c2 [4 5 6])
(def c3 [7 8 9])
(is= [1 2 3 4 5 6 7 8 9] (concat-gen c1 c2 c3))
(is= [1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9] (concat-gen-pair c1 c2 c3))
lazy-gen uses a core.async channel to buffer output, with a default buffer size of 32 (controlled by the dynamic var lazy-gen-buffer-size). Result values passed to yield generate a lazy sequence that is the result of the (lazy-gen …) macro. The closely-related function yield-all inserts the elements of a collection onto the output stream instead of just a single value. Besides doseq, lazy-gen is also very handy for generating a lazy seq within a loop-recur expression.
Validating Intermediate Results
Within a processing chain, it is often desirable to verify that an intermediate value is within an expected range or of an expected type. The built-in assert macro cannot be used for this purpose since it returns nil, and the Plumatic Schema validate can only perform a limited amount of type testing. The (validate …) function performs arbitrary validation, throwing an exception if a non-truthy result is returned:
(validate tstfn tstval)
"Used to validate intermediate results. Returns tstval if the result of
(tstfn tstval) is truthy. Otherwise, throws IllegalStateException."
(is (= 3 (validate pos? 3 )))
(is (= 3.14 (validate number? 3.14 )))
(is (= 3.14 (validate #(< 3 % 4) 3.14 )))
A closely related function is verify. It is like validate but accepts an expression instead of a predicate/value pair. Upon success, the expression value is returned; otherwise an exception is thrown:
(throws? (verify (= 1 2)))
(is= 333 (verify (* 3 111)))
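Because the validated value is returned, a check of this kind can sit in the middle of a threading pipeline. The functions validate-sketch and verify-sketch are hypothetical stand-ins written from the docstring above; the exception messages are assumptions.

```clojure
;; Hypothetical sketch following the docstring: return the value or throw.
(defn validate-sketch
  [tstfn tstval]
  (if (tstfn tstval)
    tstval
    (throw (IllegalStateException.
             (str "Validation failed for value: " (pr-str tstval))))))

;; Hypothetical sketch: a truthy value passes through, anything else throws.
(defn verify-sketch
  [value]
  (if value
    value
    (throw (IllegalStateException. "Verification failed"))))

(->> 3
     (inc)
     (validate-sketch pos?)   ; passes 4 through unchanged
     (* 2))
;=> 8
```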
Convenient Wild-Card Matches
Sometimes in testing, we want to verify that a key-value pair is present in a map, but we don’t know or care what the value is. For example, Datomic returns maps containing the key :db/id, but the associated value is unpredictable. Tupelo provides the (matches? …) expression to make these tests a snap:
(matches? pattern & values)
(matches? { :a 1 :b _ }
{ :a 1 :b 99 }
{ :a 1 :b [1 2 3] }
{ :a 1 :b nil } ) ;=> true
(matches? [1 _ 3] [1 2 3] ) ;=> true
Note that a wildcard can match either a primitive or a composite value. It works for both maps and vectors. The only restriction is that the wildcard symbol _ (underscore) cannot be used as a key in the pattern-map (it can be used anywhere in a vector-pattern).
Fast & Simple Wild-Card Matches
Sometimes using core.match is overkill. For some patterns & values it can run very slowly or even create a stack overflow exception. For most cases, all you really need is a simple wildcard match. The wild-match? function returns true if a pattern is matched by one or more values. The special keyword :* (colon-star) in the pattern serves as a wildcard value. Note that a wildcard can match either a primitive or a composite value.
Usage:
(wild-match? pattern & values)
Samples:
(wild-match? {:a :* :b 2}
{:a 1 :b 2}) ;=> true
(wild-match? [1 :* 3]
[1 2 3]
[1 9 3] ) ;=> true
(wild-match? {:a :* :b 2}
{:a [1 2 3] :b 2}) ;=> true
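A recursive matcher for the :* wildcard needs only a few clauses. The function wild-match-sketch? is hypothetical and, unlike the real wild-match?, takes exactly one value; requiring identical key sets for maps is also an assumption of this sketch.

```clojure
;; Hypothetical single-value sketch of wildcard matching.
(defn wild-match-sketch?
  [pattern value]
  (cond
    (= :* pattern) true                                   ; wildcard matches anything

    (and (map? pattern) (map? value))
    (and (= (set (keys pattern)) (set (keys value)))      ; same keys required
         (every? (fn [[k sub-pattern]]
                   (wild-match-sketch? sub-pattern (get value k)))
                 pattern))

    (and (sequential? pattern) (sequential? value))
    (and (= (count pattern) (count value))
         (every? true? (map wild-match-sketch? pattern value)))

    :else (= pattern value)))                             ; plain equality otherwise

(wild-match-sketch? {:a :* :b 2} {:a [1 2 3] :b 2})   ;=> true
(wild-match-sketch? [1 :* 3] [1 9 3])                 ;=> true
```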
Map Entries (Key-Value pairs)
Sometimes you want to extract the keys & values from a map for manipulation or extension before building up another map (especially useful for manipulating default function args). Here is a very handy function for that:
(keyvals m)
"For any map m, returns the keys & values of m as a vector,
suitable for reconstructing via (apply hash-map (keyvals m))."
(keyvals {:a 1 :b 2})
;=> [:b 2 :a 1]
(apply hash-map (keyvals {:a 1 :b 2}))
;=> {:b 2, :a 1}
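Flattening a map into alternating keys and values takes one line of clojure.core. The name keyvals-sketch is hypothetical, and the order of the pairs in the result depends on the map's own iteration order.

```clojure
;; Hypothetical sketch: each map entry [k v] is poured into one flat vector.
(defn keyvals-sketch
  [m]
  (reduce into [] m))

;; Typical use: extend default options before rebuilding a map.
(def default-opts {:verbose false :retries 3})   ; hypothetical example data

(apply hash-map (conj (keyvals-sketch default-opts) :timeout 500))
;=> a map with :verbose false, :retries 3, :timeout 500
```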
Default Value in Case of Exception
Sometimes you know an operation may result in an Exception, and you would like to have the Exception converted into a default value. That is when you need:
(with-exception-default default-val & body)
"Evaluates body & returns its result. In the event of an exception the
specified default value is returned instead of the exception."
(with-exception-default 0
(Long/parseLong "12xy3"))
;=> 0
This feature is put to good use in tupelo.parse, where you will find functions that work like this:
(parse-long "123") ; throws if parse error
;=> 123
(parse-long "1xy23" :default 666) ; returns default val if parse error
;=> 666
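A macro of this shape is essentially a try/catch that substitutes the default. The name with-exception-default-sketch is hypothetical, and catching java.lang.Exception (rather than Throwable) is an assumption of this sketch.

```clojure
;; Hypothetical sketch: evaluate body, fall back to default-val on any Exception.
(defmacro with-exception-default-sketch
  [default-val & body]
  `(try
     ~@body
     (catch Exception e#
       ~default-val)))

(with-exception-default-sketch 0
  (Long/parseLong "12xy3"))
;=> 0

(with-exception-default-sketch 0
  (Long/parseLong "123"))
;=> 123
```

A macro is needed here because a plain function would evaluate the failing body before the try block is entered.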
Floating Point Number Comparison
Everyone knows that you shouldn’t compare floating-point numbers (e.g. float, double, etc) for equality since roundoff errors can prevent a precise match between logically equivalent results. However, it has always been awkward to regenerate "approx-equals" code by hand every time a new project requires it. Here we have a simple function that compares two floating-point values (cast to double) for relative equality by specifying either the number of significant digits that must match or the maximum error tolerance allowed:
(rel= val1 val2 & opts)
"Returns true if 2 double-precision numbers are relatively equal, else false.
Relative equality is specified as either (1) the N most significant digits are
equal, or (2) the absolute difference is less than a tolerance value. Input
values are coerced to double before comparison."
An extract from the unit tests illustrates the use of rel=:
(is (rel= 123450000 123456789 :digits 4 )) ; .12345 * 10^9
(is (not (rel= 123450000 123456789 :digits 6 )))
(is (rel= 0.123450000 0.123456789 :digits 4 )) ; .12345 * 1
(is (not (rel= 0.123450000 0.123456789 :digits 6 )))
(is (rel= 1 1.001 :tol 0.01 )) ; :tol value is absolute error
(is (not (rel= 1 1.001 :tol 0.0001 )))
Note that, for the :digits variant, 'equality' is truly relative, since only the N most significant digits of each value must match.
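One plausible way to express the two criteria is shown here. The function rel=-sketch is hypothetical, and the :digits rule used (relative error below 10^-N) is an assumption that happens to agree with the test extract above; the library's exact formula may differ.

```clojure
;; Hypothetical sketch: :tol is an absolute bound, :digits a relative one.
(defn rel=-sketch
  [val1 val2 & {:keys [digits tol]}]
  (let [a     (double val1)
        b     (double val2)
        delta (Math/abs (- a b))]
    (cond
      (== a b) true
      tol      (< delta (double tol))
      digits   (< (/ delta (max (Math/abs a) (Math/abs b)))   ; relative error
                  (Math/pow 10.0 (- digits)))
      :else    (throw (IllegalArgumentException.
                        "rel=-sketch: must supply :digits or :tol")))))

(rel=-sketch 123450000 123456789 :digits 4)   ;=> true
(rel=-sketch 123450000 123456789 :digits 6)   ;=> false
(rel=-sketch 1 1.001 :tol 0.01)               ;=> true
```

Dividing by the larger magnitude makes the :digits test independent of scale, which is why 123450000 and 0.12345 behave the same way in the extract.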
String Operations
Be sure to see the dedicated functions in the tupelo.string namespace!
Suppose you have a bunch of nested results and you just want to convert everything into a single string. In that case, strcat is for you:
(is (= (strcat "I " [ \h \a nil \v [\e \space (byte-array [97])
[ nil 32 "complicated" (Math/pow 2 5) '( "str" nil "ing") ]]] )
"I have a complicated string" ))
Note that any nil values map to the empty string as with clojure.core/str.
Sometimes, you may wish to clip a string to a maximum length for ease of display. In that case, use clip-str:
(is (= "abc" (clip-str 3 "abcdefg")))
(is (= "{:a 1, :" (clip-str 8 (sorted-map :a 1 :b 2) )))
(is (= "{:a 1, :b 2}" (clip-str 99 (sorted-map :a 1 :b 2) )))
Notice that clip-str will accept any argument type (map, sequence, etc), and convert it into a string for you. The clip length is only an upper bound; shorter strings are returned unchanged.
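The behavior described can be approximated with str and subs. The name clip-str-sketch is hypothetical, and accepting several arguments to be joined is an assumption of this sketch.

```clojure
;; Hypothetical sketch: stringify the arguments, then keep at most nchars characters.
(defn clip-str-sketch
  [nchars & args]
  (let [s (apply str args)]
    (subs s 0 (min nchars (count s)))))   ; min guards against short strings

(clip-str-sketch 3 "abcdefg")                  ;=> "abc"
(clip-str-sketch 8 (sorted-map :a 1 :b 2))     ;=> "{:a 1, :"
(clip-str-sketch 99 (sorted-map :a 1 :b 2))    ;=> "{:a 1, :b 2}"
```

Without the min guard, subs would throw StringIndexOutOfBoundsException whenever the string is shorter than the clip length.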
Keeping & Dropping Elements of a Sequence
When processing sequences of data, we often need to extract a sequence of desired data, or, conversely, remove all of the undesired elements. Have you ever been left wondering which of these two forms is correct?
(let [result (filter even? (range 10)) ]
(assert (or (= result [ 1 3 5 7 9 ] ) ; is it "remove bad" (falsey)
(= result [ 0 2 4 6 8 ] )))) ; or "keep good" (truthy) ???
I normally think of filters as removing bad things. Air filters remove dust. Coffee filters keep coffee grounds out of my cup. A noise filter in my stereo removes contaminating frequencies from my music. However, filter in Clojure is written in reverse, so that it keeps items identified by the predicate. Wouldn't it be nicer (and much less ambiguous) if you could just write the following?
(is (= [0 2 4 6 8] (keep-if even? (range 10))
(drop-if odd? (range 10))))
It seems to me that keep-if and drop-if are much more natural names. Of course, these are just thin wrappers around the built-in clojure.core functions, but they remove ambiguity from the code. I think they make the code easier to read and the intent more obvious.
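For reference, the two clojure.core functions with these semantics are filter (keep the items where the predicate is truthy) and remove (drop them). The snippet below uses only clojure.core; that these are the functions being wrapped is an assumption here, and the var names are illustrative.

```clojure
;; filter keeps matching items; remove drops them.
(def evens-kept   (filter even? (range 10)))
(def odds-dropped (remove odd?  (range 10)))

;; remove is the same as filter with the predicate complemented.
(def odds-dropped-2 (filter (complement odd?) (range 10)))

(= [0 2 4 6 8] evens-kept odds-dropped odds-dropped-2)   ;=> true
```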
Keeping & Dropping Elements from a Map or Set
The two functions keep-if and drop-if can be used equally well in order to retain or remove elements from a Clojure map or set. The semantics for sets look the same as for a sequence (vector or list). The predicate can be any 1-arg function:
(keep-if even? #{1 2 3 4 5} )
;=> #{4 2}
(drop-if even? #{1 2 3 4 5} )
;=> #{1 3 5}
Notice that the functions recognized the input collection as a set, and returned a set as the result. Very convenient.
For maps, each element is a MapEntry, which contains both a key and value. keep-if and drop-if understand maps, and will destructure each MapEntry. Thus, the predicate function can be any 2-arg function:
(def mm {10 0, 20 0
11 1, 21 1
12 2, 22 2
13 3, 23 3} )
(is (= (keep-if (fn [k v] (odd? v)) mm)
(drop-if (fn [k v] (even? v)) mm)
{11 1, 21 1
13 3, 23 3} ))
(is (= (keep-if (fn [k v] (< k 19)) mm)
(drop-if (fn [k v] (> k 19)) mm)
{10 0
11 1
12 2
13 3} ))
As with sets, the functions recognized that a map was supplied, accepted a 2-arg predicate function, and returned a map to the user.
Both keep-if and drop-if will throw an Exception if the predicate function supplied has the wrong arity, or if the supplied collection is not a sequential (vector or list), map, or set data type.
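The dispatch on collection type described in this section could be sketched as follows. The names keep-if-sketch and drop-if-sketch are hypothetical, and the code is one possible arrangement under clojure.core, not a copy of Tupelo's source.

```clojure
;; Hypothetical sketch: choose the filtering strategy from the collection type.
(defn keep-if-sketch
  [pred coll]
  (cond
    ;; vectors, lists, and seqs: 1-arg predicate, result is a vector
    (sequential? coll) (vec (filter pred coll))
    ;; maps: destructure each entry and call the 2-arg predicate
    (map? coll)        (into (empty coll)
                             (filter (fn [[k v]] (pred k v)) coll))
    ;; sets: 1-arg predicate, result is a set
    (set? coll)        (into (empty coll) (filter pred coll))
    :else (throw (IllegalArgumentException.
                   (str "unsupported collection type: " (type coll))))))

(defn drop-if-sketch
  [pred coll]
  ;; complement works for predicates of any arity
  (keep-if-sketch (complement pred) coll))

(keep-if-sketch even? #{1 2 3 4 5})                             ;=> #{4 2}
(drop-if-sketch (fn [k v] (even? v)) {10 0, 11 1, 12 2, 13 3})  ;=> {11 1, 13 3}
```

In this sketch a predicate of the wrong arity fails with the usual ArityException from the Clojure runtime, with no explicit check needed.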
Extracting Only Values
The pervasive use of seq's in Clojure means that scalar values often appear wrapped in a vector or some other sequence type. As a result, one often sees code like (first some-var), and it is not always clear that the code is simply "unwrapping" a scalar value, since there could well be remaining values in the sequence. Indeed, for a length-1 sequence it would be equally valid to use (last some-var), since first=last if there is only one item in the list.
To clarify that we are simply unwrapping a single value from the sequence, we may use the function only:
(only seq-arg)
"Ensures that a sequence is of length=1, and returns the only value present.
Throws an exception if the length of the sequence is not one. Note that,
for a length-1 sequence S, (first S), (last S) and (only S) are equivalent."
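A minimal sketch of the length check, using only clojure.core, might look like the following. The name only-sketch and the choice of IllegalArgumentException are assumptions made for illustration; the docstring above only promises that some exception is thrown.

```clojure
;; Hypothetical sketch of an "exactly one element" accessor.
(defn only-sketch
  [coll]
  ;; (seq coll) is nil for an empty collection;
  ;; (next coll) is non-nil when a second element exists.
  (when-not (and (seq coll) (nil? (next coll)))
    (throw (IllegalArgumentException. "expected exactly one element")))
  (first coll))

(only-sketch [42])     ;=> 42
;; (only-sketch [])    throws IllegalArgumentException
;; (only-sketch [1 2]) throws IllegalArgumentException
```

Checking with seq and next avoids counting the whole collection, so the test also terminates on long or lazy sequences.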
Getting Past Second Base
Clojure has the functions first and second, but requires the use of nth for any subsequent position. Sometimes it is handy to have a quick way to grab the 3rd item from a sequential collection. Tupelo provides the third function to fill this void:
(is= nil (third [ ]))
(is= nil (third [1 ]))
(is= nil (third [1 2 ]))
(is= 3 (third [1 2 3 ]))
(is= 3 (third [1 2 3 4]))
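The nil-returning behavior shown in these tests can be reproduced with clojure.core alone. The name third-sketch is hypothetical and this is not necessarily how Tupelo defines third.

```clojure
;; Hypothetical sketch: nil when fewer than three items are present.
(defn third-sketch
  [coll]
  (first (drop 2 coll)))

(third-sketch [1 2])       ;=> nil
(third-sketch [1 2 3 4])   ;=> 3

;; By contrast, (nth [1 2] 2) throws IndexOutOfBoundsException,
;; unless a not-found value is supplied:
(nth [1 2] 2 nil)          ;=> nil
```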
The Truth Is Not Ambiguous
Clojure marries the worlds of Java and Lisp. Unfortunately, these two worlds have different ideas of truth, so Clojure accepts both false and nil as false. Sometimes, however, you want to coerce logical values into literal true or false values, so we provide a simple way to do that:
(truthy? arg)
"Returns true if arg is logical true (neither nil nor false);
otherwise returns false."
(falsey? arg)
"Returns true if arg is logical false (either nil or false);
otherwise returns false. Equivalent to (not (truthy? arg))."
Since truthy? and falsey? are functions (instead of special forms or macros), we can use them as an argument to filter or any other place that a higher-order-function is required:
(def data [true :a 'my-symbol 1 "hello" \x false nil])
(filter truthy? data)
;=> [true :a my-symbol 1 "hello" \x]
(filter falsey? data)
;=> [false nil]
(is (every? truthy? [true :a 'my-symbol 1 "hello" \x] ))
(is (every? falsey? [false nil] ))
(let [count-if (comp count keep-if) ]
(let [num-true (count-if truthy? data) ; <= better than (count-if boolean data)
num-false (count-if falsey? data) ] ; <= better than (count-if not data)
(is (and (= 6 num-true)
(= 2 num-false) ))))
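The two docstrings above translate almost directly into code. The definitions below are a sketch with hypothetical names (truthy-sketch? and falsey-sketch?), written only to make the coercion concrete.

```clojure
;; Hypothetical sketches: coerce any value to a literal true or false.
(defn truthy-sketch? [arg] (if arg true false))
(defn falsey-sketch? [arg] (if arg false true))

;; Only nil and false are logically false; 0, "" and [] are all truthy.
(mapv truthy-sketch? [0 "" [] nil false])   ;=> [true true true false false]
(mapv falsey-sketch? [0 "" [] nil false])   ;=> [false false false true true]
```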
Keeping It Simple with not-nil?
Clojure has the built-in function some to return the first truthy result of applying a predicate to the items of a sequence argument. It also has the poorly named function some? which returns the value true if a scalar argument satisfies (not (nil? arg)). It is easy to confuse some and some?, not only in their return type but also in the argument they accept (sequence or scalar). In keeping with the style for other basic test functions, we provide the function not-nil? as the opposite of nil?.
The unit tests show how not-nil? leads to a more natural code syntax:
(let [data [true :a 'my-symbol 1 "hello" \x false nil] ]
(let [notties (keep-if not-nil? data)
nillies (drop-if not-nil? data) ]
(is (and (= notties [true :a 'my-symbol 1 "hello" \x false] )
(= nillies [nil] )))
(is (every? not-nil? notties)) ; the 'not' can be used
(is (not-any? nil? notties))) ; in either first or 2nd position
(let [count-if (comp count keep-if) ]
(let [num-valid-1 (count-if some? data) ; awkward phrasing, doesn't feel natural
num-valid-2 (count-if not-nil? data) ; matches intent much better
num-nil (count-if nil? data) ] ; intent is plain
(is (and (= 7 num-valid-1 num-valid-2 )
(= 1 num-nil))))))
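The difference between some and some? is easy to see at the REPL. The snippet below uses only clojure.core; not-nil-sketch? is a hypothetical stand-in defined here for illustration, not Tupelo's definition.

```clojure
;; some takes a predicate and a sequence; it returns the first truthy
;; result of the predicate, or nil if there is none.
(some even? [1 2 3])     ;=> true
(some #{2 3} [1 2 3])    ;=> 2
(some even? [1 3 5])     ;=> nil

;; some? takes a single value and returns a boolean.
(some? false)            ;=> true
(some? nil)              ;=> false

;; Hypothetical stand-in for a not-nil? predicate.
(def not-nil-sketch? (complement nil?))

(filterv not-nil-sketch? [true false nil 0])   ;=> [true false 0]
```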
Identifying Sequences
Update 2016-6-13: Now included in clojure.core 1.9.0-alpha5!
In some situations, a function may need to verify that an argument is seqable, that is, will a call to (seq some-arg) succeed? If so, some-arg may be interpreted as a sequence of values. Before version 1.9, Clojure didn't have a built-in function for this (please note that seqable? is different from seq?), but we can copy a solution from the old clojure.contrib.core/seqable:
(is (seqable? "abc"))
(is (seqable? {1 2 3 4} ))
(is (seqable? #{1 2 3} ))
(is (seqable? '(1 2 3) ))
(is (seqable? [1 2 3] ))
(is (seqable? (byte-array [1 2] )))
(is (not (seqable? 1 )))
(is (not (seqable? \a )))
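One way such a check can be written for Clojure versions that lack a built-in is sketched below. The name seqable-sketch? is hypothetical, and the list of accepted types is an assumption about what seq supports rather than a copy of any library's source.

```clojure
;; Hypothetical sketch: would (seq x) succeed for this value?
(defn seqable-sketch?
  [x]
  (boolean
    (or (nil? x)                            ; (seq nil) returns nil without error
        (instance? clojure.lang.Seqable x)  ; vectors, lists, maps, sets, seqs
        (instance? Iterable x)              ; Java collections
        (instance? java.util.Map x)         ; Java maps
        (string? x)
        (.isArray (class x)))))             ; Java arrays

(seqable-sketch? [1 2 3])   ;=> true
(seqable-sketch? "abc")     ;=> true
(seqable-sketch? 1)         ;=> false

;; seq? is a narrower test: it is true only for actual seqs.
(seq? [1 2 3])              ;=> false
(seq? '(1 2 3))             ;=> true
```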
Change Log
Please see the ChangeLog for details.
Other useful libraries
There are several other libraries that provide useful value-added functionality to clojure.core:
- The Clojure Toolbox - For a comprehensive list of Clojure libraries
Requirements
- Clojure 1.8.0
- Java 1.8
License
Copyright © 2015-2017 Alan Thompson
Distributed under the Eclipse Public License, the same as Clojure.
Development Environment
Developed using IntelliJ IDEA with the Cursive Clojure plugin.
YourKit supports open source projects with its full-featured Java Profiler. YourKit, LLC is the creator of YourKit Java Profiler and YourKit .NET Profiler, innovative and intelligent tools for profiling Java and .NET applications.
ToDo List (#todo)
- types
- schema (& schema-datomic)
- re-work csv
- kill y64?
- Update all NS docstrings
- zipcode distance testing
- lein plugin
- make CLJS compatible
- more docs for other namespaces
- add more test.check
- add spy-let, spy-defn, spy-validate, etc
- blog posts