• Stars
    star
    573
  • Rank 77,865 (Top 2 %)
  • Language
    Clojure
  • Created about 9 years ago
  • Updated 10 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Extra transducers and reducing fns for Clojure(script)

xforms

More transducers and reducing functions for Clojure(script)!

Transducers can be classified in three groups: regular ones, higher-order ones (which accept other transducers as arguments) and aggregators (transducers which emit only 1 item out no matter how many went in). Aggregators generally only make sense in the context of a higher-order transducer.

In net.cgrand.xforms:

  • regular ones: partition (1 arg), reductions, for, take-last, drop-last, sort, sort-by, wrap, window and window-by-time
  • higher-order ones: by-key, into-by-key, multiplex, transjuxt, partition (2+ args), time
  • aggregators: reduce, into, without, transjuxt, last, count, avg, sd, min, minimum, max, maximum, str

In net.cgrand.xforms.io:

  • sh to use any process as a reducible collection (of stdout lines) or as a transducers (input as stdin lines, stdout lines as output).

Reducing functions

  • in net.cgrand.xforms.rfs: min, minimum, max, maximum, str, str!, avg, sd, last and some.
  • in net.cgrand.xforms.io: line-out and edn-out.

(in net.cgrand.xforms)

Transducing contexts:

  • in net.cgrand.xforms: transjuxt (for performing several transductions in a single pass), iterator (clojure only), into, without, count, str (2 args) and some.
  • in net.cgrand.xforms.io: line-out (3+ args) and edn-out (3+ args).
  • in net.cgrand.xforms.nodejs.stream: transformer.

Reducible views (in net.cgrand.xforms.io): lines-in and edn-in.

Note: it should always be safe to update to the latest xforms version; short of bugfixes, breaking changes are avoided.

Add as a dependency

For specific coordinates see the Releases page.

Usage

=> (require '[net.cgrand.xforms :as x])

str and str! are two reducing functions to build Strings and StringBuilders in linear time.

=> (quick-bench (reduce str (range 256)))
             Execution time mean : 58,714946 µs
=> (quick-bench (reduce rf/str (range 256)))
             Execution time mean : 11,609631 µs

for is the transducing cousin of clojure.core/for:

=> (quick-bench (reduce + (for [i (range 128) j (range i)] (* i j))))
             Execution time mean : 514,932029 µs
=> (quick-bench (transduce (x/for [i % j (range i)] (* i j)) + 0 (range 128)))
             Execution time mean : 373,814060 µs

You can also use for like clojure.core/for: (x/for [i (range 128) j (range i)] (* i j)) expands to (eduction (x/for [i % j (range i)] (* i j)) (range 128)).

by-key and reduce are two new transducers. Here is an example usage:

;; reimplementing group-by
(defn my-group-by [kfn coll]
  (into {} (x/by-key kfn (x/reduce conj)) coll))

;; let's go transient!
(defn my-group-by [kfn coll]
  (into {} (x/by-key kfn (x/into [])) coll))

=> (quick-bench (group-by odd? (range 256)))
             Execution time mean : 29,356531 µs
=> (quick-bench (my-group-by odd? (range 256)))
             Execution time mean : 20,604297 µs

Like by-key, partition also takes a transducer as last argument to allow further computation on the partition.

=> (sequence (x/partition 4 (x/reduce +)) (range 16))
(6 22 38 54)

Padding is achieved as usual:

=> (sequence (x/partition 4 4 (repeat :pad) (x/into [])) (range 9))
([0 1 2 3] [4 5 6 7] [8 :pad :pad :pad])

avg is a transducer to compute the arithmetic mean. transjuxt is used to perform several transductions at once.

=> (into {} (x/by-key odd? (x/transjuxt [(x/reduce +) x/avg])) (range 256))
{false [16256 127], true [16384 128]}
=> (into {} (x/by-key odd? (x/transjuxt {:sum (x/reduce +) :mean x/avg :count x/count})) (range 256))
{false {:sum 16256, :mean 127, :count 128}, true {:sum 16384, :mean 128, :count 128}}

window is a new transducer to efficiently compute a windowed accumulator:

;; sum of last 3 items
=> (sequence (x/window 3 + -) (range 16))
(0 1 3 6 9 12 15 18 21 24 27 30 33 36 39 42)

=> (def nums (repeatedly 8 #(rand-int 42)))
#'user/nums
=> nums
(11 8 32 26 6 10 37 24)

;; avg of last 4 items
=> (sequence
     (x/window 4 rf/avg #(rf/avg %1 %2 -1))
     nums)
(11 19/2 17 77/4 18 37/2 79/4 77/4)

;; min of last 3 items
=> (sequence
        (x/window 3
          (fn
            ([] (sorted-map))
            ([m] (key (first m)))
            ([m x] (update m x (fnil inc 0))))
          (fn [m x]
            (let [n (dec (m x))]
              (if (zero? n)
                (dissoc m x)
                (assoc m x (dec n))))))
        nums)
(11 8 8 8 6 6 6 10)

On Partitioning

Both by-key and partition takes a transducer as parameter. This transducer is used to further process each partition.

It's worth noting that all transformed outputs are subsequently interleaved. See:

=> (sequence (x/partition 2 1 identity) (range 8))
(0 1 1 2 2 3 3 4 4 5 5 6 6 7)
=> (sequence (x/by-key odd? identity) (range 8))
([false 0] [true 1] [false 2] [true 3] [false 4] [true 5] [false 6] [true 7])

That's why most of the time the last stage of the sub-transducer will be an aggregator like x/reduce or x/into:

=> (sequence (x/partition 2 1 (x/into [])) (range 8))
([0 1] [1 2] [2 3] [3 4] [4 5] [5 6] [6 7])
=> (sequence (x/by-key odd? (x/into [])) (range 8))
([false [0 2 4 6]] [true [1 3 5 7]])

Simple examples

(group-by kf coll) is (into {} (x/by-key kf (x/into []) coll)).

(plumbing/map-vals f m) is (into {} (x/by-key (map f)) m).

My faithful (reduce-by kf f init coll) is now (into {} (x/by-key kf (x/reduce f init))).

(frequencies coll) is (into {} (x/by-key identity x/count) coll).

On key-value pairs

Clojure reduce-kv is able to reduce key value pairs without allocating vectors or map entries: the key and value are passed as second and third arguments of the reducing function.

Xforms allows a reducing function to advertise its support for key value pairs (3-arg arity) by implementing the KvRfable protocol (in practice using the kvrf macro).

Several xforms transducers and transducing contexts leverage reduce-kv and kvrf. When these functions are used together, pairs can be transformed without being allocated.

fnkvs in?kvs out?
`for`when first binding is a pairwhen `body-expr` is a pair
`reduce`when is `f` is a kvrfno
1-arg `into`
(transducer)
when `to` is a mapno
3-arg `into`
(transducing context)
when `from` is a mapwhen `to` is a map
`by-key`
(as a transducer)
when is `kfn` and `vfn` are unspecified or `nil`when `pair` is `vector` or unspecified
`by-key`
(as a transducing context on values)
nono
;; plain old sequences
=> (let [m (zipmap (range 1e5) (range 1e5))]
     (crit/quick-bench
       (into {}
         (for [[k v] m]
           [k (inc v)]))))
Evaluation count : 12 in 6 samples of 2 calls.
             Execution time mean : 55,150081 ms
    Execution time std-deviation : 1,397185 ms

;; x/for but pairs are allocated (because of into) 
=> (let [m (zipmap (range 1e5) (range 1e5))]
     (crit/quick-bench
       (into {}
         (x/for [[k v] _]
           [k (inc v)])
         m)))
Evaluation count : 18 in 6 samples of 3 calls.
             Execution time mean : 39,119387 ms
    Execution time std-deviation : 1,456902 ms
    
;; x/for but no pairs are allocated (thanks to x/into) 
=> (let [m (zipmap (range 1e5) (range 1e5))]
     (crit/quick-bench (x/into {}
               (x/for [[k v] %]
                 [k (inc v)])
               m)))
Evaluation count : 24 in 6 samples of 4 calls.
             Execution time mean : 24,276790 ms
    Execution time std-deviation : 364,932996 µs

Changelog

0.19.4

  • Fix ClojureScript compilation broken in 0.19.3 #49
  • Fix x/sort and x/sort-by for ClojureScript #40

0.19.3

  • Add deps.edn to enable usage as a git library
  • Bump macrovich to make Clojure and ClojureScript provided dependencies #34
  • Fix reflection warnings in xforms.io #35 #36
  • Add compatibility with babashka #42
  • Fix x/destructuring-pair? #44 #45
  • Fix x/into performance hit with small maps #46 #47
  • Fix reflection and shadowing warnings in tests

0.19.2

  • Fix infinity symbol causing issues with ClojureScript #31

0.19.0

time allows to measure time spent in one transducer (excluding time spent downstream).

=> (time ; good old Clojure time
     (count (into [] (comp
                     (x/time "mapinc" (map inc))
                     (x/time "filterodd" (filter odd?))) (range 1e6))))
filterodd: 61.771738 msecs
mapinc: 143.895317 msecs
"Elapsed time: 438.34291 msecs"
500000

First argument can be a function that gets passed the time (in ms), this allows for example to log time instead of printing it.

0.9.5

  • Short (up to 4) literal collections (or literal collections with :unroll metadata) in collection positions in x/for are unrolled. This means that the collection is not allocated. If it's a collection of pairs (e.g. maps), pairs themselves won't be allocated.

0.9.4

  • Add x/into-by-key short hand

0.7.2

  • Fix transients perf issue in Clojurescript

0.7.1

  • Works with Clojurescript (even self-hosted).

0.7.0

  • Added 2-arg arity to x/count where it acts as a transducing context e.g. (x/count (filter odd?) (range 10))
  • Preserve type hints in x/for (and generally with kvrf).

0.6.0

  • Added x/reductions
  • Now if the first collection expression in x/for is not a placeholder then x/for works like x/for but returns an eduction and performs all iterations using reduce.

Troubleshooting xforms in a Clojurescript dev environment

If you use xforms with Clojurescript and the Emacs editor to start your figwheel REPL be sure to include the cider.nrepl/cider-middleware to your figwheel's nrepl-middleware.

  :figwheel {...
             :nrepl-middleware [cider.nrepl/cider-middleware;;<= that middleware
                                refactor-nrepl.middleware/wrap-refactor
                                cemerick.piggieback/wrap-cljs-repl]
             ...}

Otherwise a strange interaction occurs and every results from your REPL evaluation would be returned as a String. Eg.:

cljs.user> 1
"1"
cljs.user>

instead of:

cljs.user> 1
1
cljs.user>

License

Copyright © 2015-2016 Christophe Grand

Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.

More Repositories

1

enlive

a selector-based (à la CSS) templating and transformation system for Clojure
Clojure
1,619
star
2

moustache

a micro web framework/internal DSL to wire Ring handlers and middlewares
Clojure
261
star
3

enliven

Enlive next: faster, better, broader
Clojure
243
star
4

seqexp

Regexp for sequences!
Clojure
241
star
5

parsley

a DSL for creating total and truly incremental parsers in Clojure
Clojure
200
star
6

macrovich

A set of three macros to ease writing `*.cljc` supporting Clojure, Clojurescript and self-hosted Clojurescript.
Clojure
163
star
7

sjacket

Structural code transformations for the masses.
Clojure
114
star
8

megaref

STM ref types that allow for more concurrency on associative values.
Clojure
95
star
9

spreadmap

Evil project to turn excel spreadsheets in persistent reactive structures.
Clojure
89
star
10

regex

a regex DSL for those who prefer verbose composable regexes to terse ones
Clojure
88
star
11

poucet

trace as data for Clojure/JVM
Clojure
86
star
12

confluent-map

A persistent confluent map for Clojure
Java
39
star
13

packed-printer

Compact pretty printer
Clojure
37
star
14

sqrel

The SQL library that won't drive you nuts.
Clojure
37
star
15

utils

useful functions and extensible macros
Clojure
36
star
16

indexed-set

A set implementation which supports unicity constraints and maintains summaries (indexes).
Clojure
31
star
17

replay

Instant test suites from repl transcript.
Clojure
26
star
18

cljs-js-repl

Upgradable self hosted clojurescript repl
Clojure
21
star
19

parsnip

parsley is dead, long live parsnip!
Clojure
17
star
20

boring

A tunnel-boring library
Clojure
11
star
21

enlivez

Clojure
8
star
22

dynvars

dynamic bindings in callback hell
Clojure
5
star
23

sandbox

expeiments, works in progress etc.
Clojure
4
star
24

advent2017

Clojure
2
star
25

trainings.geneva.2012

Ressources issues des formations Clojure données du 13 au 15 mai 2012
Clojure
2
star
26

berlin-tron

Clojure
1
star