• Stars
    star
    609
  • Rank 73,614 (Top 2 %)
  • Language
    Clojure
  • License
    MIT License
  • Created over 14 years ago
  • Updated about 2 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Read and write Office documents from Clojure

Docjure

Docjure makes reading and writing Office Excel spreadsheet documents in Clojure easy.

Who is this for?

Docjure is aimed at making the basic use case of reading and writing spreadsheets easy.

If you need advanced charting, pivot tables etc., the easiest way is to build template spreadsheets with Excel and populate them with Docjure.

If you want to manipulate advanced features programatically, you are probably better off using the underlying Apache POI spreadsheet library directly or looking for another tool.

Docjure has low churn. It is very stable library with a history going back to 2009 (open-sourced in 2010).

Usage

Example: Read a Price List spreadsheet

(use 'dk.ative.docjure.spreadsheet)

;; Load a spreadsheet and read the first two columns from the
;; price list sheet:
(->> (load-workbook "spreadsheet.xlsx")
     (select-sheet "Price List")
     (select-columns {:A :name, :B :price}))

;=> [{:name "Foo Widget", :price 100}, {:name "Bar Widget", :price 200}]

Example: Read a single cell

If you want to read a single cell value, you can use the select-cell function which takes an Excel-style cell reference (A2) and returns the cell. In order to get the actual value, use read-cell

(use 'dk.ative.docjure.spreadsheet)
(read-cell
 (->> (load-workbook "spreadsheet.xlsx")
      (select-sheet "Price List")
      (select-cell "A1")))

Example: Load a Workbook from a Resource

This example loads a workbook from a named file. In the case of running in the application server, the file typically resides in the resources directory, and it's not on the caller's path. To cover this scenario, we provide the function 'load-workbook-from-resource' that takes a named resource as the parameter. After a minor modification, the same example will look like:

(->> (load-workbook-from-resource "spreadsheet.xlsx")
     (select-sheet "Price List")
     (select-columns {:A :name, :B :price}))

Example: Load a Workbook from a Stream

The function 'load-workbook' is a multimethod, and the first example takes a file name as a parameter. The overloaded version of 'load-workbook' takes an InputStream. This may be useful when uploading a workbook to the server over HTTP connection as multipart form data. In this case, the web framework passes a byte buffer, and the example should be modified as (note that you have to use 'with-open' to ensure that the stream will be closed):

(with-open [stream (clojure.java.io/input-stream bytes)]
  (->> (load-workbook stream)
       (select-sheet "Price List")
       (select-columns {:A :name, :B :price})))

Example: Create a spreadsheet

This example creates a spreadsheet with a single sheet named "Price List". It has three rows. We apply a style of yellow background colour and bold font to the top header row, then save the spreadsheet.

(use 'dk.ative.docjure.spreadsheet)

;; Create a spreadsheet and save it
(let [wb (create-workbook "Price List"
                          [["Name" "Price"]
                           ["Foo Widget" 100]
                           ["Bar Widget" 200]])
      sheet (select-sheet "Price List" wb)
      header-row (first (row-seq sheet))]
  (set-row-style! header-row (create-cell-style! wb {:background :yellow,
                                                     :font {:bold true}}))
  (save-workbook! "spreadsheet.xlsx" wb))

Example: Create a workbook with multiple sheets

This example creates a spreadsheet with multiple sheets. Simply add more sheet-name and data pairs. To create a sheet with no data, pass nil as the data argument.

(use 'dk.ative.docjure.spreadsheet)

;; Create a spreadsheet and save it
(let [wb (create-workbook "Squares"
                          [["N" "N^2"]
                           [1 1]
                           [2 4]
                           [3 9]]
                          "Cubes"
                          [["N" "N^3"]
                           [1 1]
                           [2 8]
                           [3 27]])]
   (save-workbook! "exponents.xlsx" wb))

Example: Use Excel Formulas in Clojure

Docjure allows you not only to evaluate a formula cell in a speadsheet, it also provides a way of exposing a formula in a cell as a Clojure function using the cell-fn function.

(use 'dk.ative.docjure.spreadsheet)
;; Load a speadsheet and take the first sheet, construct a function from cell A2, taking
;; A1 as input.
(def formula-from-a2 (cell-fn "A2"
                                  (first (sheet-seq (load-workbook "spreadsheet.xlsx")))
                                  "A1"))

;; Returns value of cell A2, as if value in cell A1 were 1.0
(formula-from-a2 1.0)

Example: Handling Error Cells

If the spreadsheet being read contains cells with errors the default behaviour of the library is to return a keyword representing the error as the cell value.

For example, given a spreadsheet with errors:

(use 'dk.ative.docjure.spreadsheet)

(def sample-cells (->> (load-workbook "spreadsheet.xlsx")
                       (sheet-seq)
                       (mapcat cell-seq)))

sample-cells

;=> (#<XSSFCell 15.0> #<XSSFCell NA()> #<XSSFCell 35.0> #<XSSFCell 13/0> #<XSSFCell 33.0> #<XSSFCell 96.0>)

Reading error cells, or cells that evaluate to an error (e.g. divide by zero) returns a keyword representing the type of error from read-cell.

(->> sample-cells
     (map read-cell))

;=> (15.0 :NA 35.0 :DIV0 33.0 96.0)

How you handle errors will depend on your application. You may want to replace specific errors with a default value and remove others for example:

(->> sample-cells
     (map read-cell)
     (replace {:DIV0 0.0})
     (remove keyword?))

;=> (15.0 35.0 0.0 33.0 96.0)

The following is a list of all possible error values:

#{:VALUE :DIV0 :CIRCULAR_REF :REF :NUM :NULL :FUNCTION_NOT_IMPLEMENTED :NAME :NA}

Example: Iterating over spreadsheet data

A note on sparse data

It's worth understanding a bit about the underlying structure of a spreadsheet before you start iterating over the contents.

Spreadsheets are designed to be sparse - not all rows in the spreadsheet must physically exist, and not all cells in a row must physically exist. This is how you can create data at ZZ:65535 without using huge amounts of storage.

Thus each cell can be in 3 states - with data, blank, or nonexistent (null). There's a special type CellType.BLANK for blank cells, but missing cells are just returned as nil.

Similarly rows can exist with cells, or exist but be empty, or they can not exist at all.

Prior to Docjure 1.11 the iteration functions wrapped the underlying Apache POI iterators, which skipped over missing data - this could cause surprising behaviour, especially when there were missing cells inside tabular data.

Since Docjure 1.11 iteration now returns nil values for missing rows and cells - this is a breaking change - any code that calls row-seq or cell-seq now needs to deal with possible nil values.

Iterating over rows

You can iterate over all the rows in a worksheet with row-seq:

(->> (load-workbook "test.xls")
     (select-sheet "first")
     row-seq)

This will return a sequence of org.apache.poi.usermodel.Row objects, or nil for any missing rows. You can use (remove nil? (row-seq ...) ) if you are happy to ignore missing rows, but then be aware the nth result in the sequence might not match the nth row in the spreadsheet.

Iterating over cells

You can iterate over all the cells in a row with cell-seq - this returns a sequence of org.apache.poi.usermodel.Cell objects, or nil for missing cells. Note that (read-cell nil) returns nil so it's safe to apply read-cell to the results of cell-seq

(->> (load-workbook "test.xls")
     (select-sheet "first")
     row-seq
     (remove nil?)
     (map cell-seq)
     (map #(map read-cell %)))

For example, if you run the above snippet on a sparse spreadsheet like:

First Name Middle Name Last Name
Edger Allen Poe
(missing row)
John (missing) Smith

Then it will return:

(("First Name" "Middle Name" "Last Name")
 ("Edger" "Allen" "Poe")
 ("John" nil "Smith"))

Formatting: Adjust Column Width to Contents

You can adjust the column widths to they fit the contents. This makes generated workbooks look nicer.

For example, to auto-size all the columns in all the sheets in a workbook, use this:

;; wb is a workbook
(dorun (for [sheet (sheet-seq wb)]
             (auto-size-all-columns! sheet)))

To apply auto-width to individual columns in a sheet, use the auto-size-column! function.

Automatically get the Docjure jar from Clojars

The Docjure jar is distributed on Clojars. Here you can find both release builds and snapshot builds of pre-release versions.

If you are using the Leiningen build tool just add this line to the :dependencies list in project.clj to use it:

[dk.ative/docjure "1.14.0"]

Remember to issue the 'lein deps' command to download it.

Example project.clj for using Docjure 1.17.0

(defproject some.cool/project "1.0.0-SNAPSHOT"
      :description "Spreadsheet magic using Docjure"
      :dependencies [[org.clojure/clojure "1.10.0"]
                     [dk.ative/docjure "1.17.0"]])

Installation for Contributors

You need to install the Leiningen build tool to build the library. You can get it here: Leiningen

The library uses the Apache POI library which will be downloaded by the "lein deps" command.

Then build the library:

 lein deps
 lein compile
 lein test

To run the tests on all supported Clojure versions use:

lein all test

To check for security issues use:

lein nvd check

To check for new versions of dependencies:

lein ancient

Releasing to Clojars

When releasing a version to Clojars you must provide your user-name. The password is a deployment token, not your normal password. You can generate this by logging into Clojars. These tokens have an expiration date so if it does not work, log in a check if you need a new token.

    lein deploy clojars

You also need a GPG key to sign the releases.

These also expire, and must be periodically renewed. You can check your keys and their status like this:

    gpg --list-keys

See also lein help deploy and lein help gpg.

Remember to tag releases in git. You can list the tags with git tag -n and tag with git tag -a TAGNAME -m 'Tag comment'.

Build Status

Build Status

License

Copyright (c) 2009-2022 Martin Jul

Docjure is licensed under the MIT License. See the LICENSE file for the license terms.

Docjure uses the Apache POI library, which is licensed under the Apache License v2.0.

For more information on Apache POI refer to the Apache POI web site.

Contact information

Martin Jul

Contributors

This library includes great contributions from

Thank you very much!

Honorary Mention

A special thank you also goes out to people that did not contribute code but shared their ideas, reported security issues or bugs and otherwise inspired the continuing work on the project.

More Repositories

1

cucumber-tutorial

Setting up and using Cucumber for Clojure projects
Clojure
41
star
2

storm-tutorial

Clojure tutorial for Storm stream processing
Clojure
14
star
3

diesel

Diesel is a DSL toolkit for .NET code generation for DDD
C#
12
star
4

zeromq-tutorial

Clojure
10
star
5

SpracheParserCombinatorLab

Demonstration of the Sprache parser combinator for .NET
C#
5
star
6

logiclab

Examples and tutorial for the excellent Clojure core.logic logic programming library
Clojure
4
star
7

workflows-in-fsharp

Exploring workflow domain modelling in F#
F#
4
star
8

top-10-clojure-oredev-2011

Slides and code for my Øredev 2011 presentation
Clojure
3
star
9

cprnumre-dotnet

A CPR-nummer (plural: CPR-numre) is a Danish government citizien ID number. This library implements the CPR-nummer semantics.
F#
3
star
10

langchain-mistral-chat

Taking LangChain for a spin with the Mistral AI 7B model
Python
2
star
11

clojure-in-10-mins

Clojure in 10 Minutes - presentation and code
Clojure
2
star
12

cph-clj-web-dev

Copenhagen Clojure Meetup session on Web Development Nov. 4, 2010
Clojure
2
star
13

aws-lambda-steps

Playing with AWS Lambda Step Functions
Python
2
star
14

aws-lambda-fsharp-lab

Trying out F# on AWS Lambda.
Shell
2
star
15

heroku-zipurl

Example Clojure web app for the Heroku cloud
Clojure
2
star
16

top-10-clojure-ndc-2011

Top Ten Things to Learn from Clojure that will make you a better developer in any language. Presentation and code for my session at NDC 2011.
Clojure
2
star
17

fastapi-lab

Python FastAPI playground
Python
2
star
18

opencvlabpy

OpenCV playground with Python
Python
1
star
19

free-monad-interpreter-for-pascal

A free monad interpreter for a tiny subset of Pascal, in Rust
Rust
1
star
20

pinecone-canopy-lab

Taking Pinecone Canopy RAG search for a spin (RAG: Retrieval-Augmented Generation)
1
star
21

clojo-picasquez-and-velasso

Clojure version of the F# dojo, Dojo-Picasquez-and-Velasso
1
star
22

cloco

Clojure Cookbook - Clojure Recipes for Common Tasks
1
star
23

cph-clj-incanter

Graphing and Statistics with Incanter, Copenhagen Clojure Meetup, 17 Mar. 2011
1
star
24

fsharp-ikvm-docjure

Using the Docjure library from F# via IKVM
1
star
25

tensorflow-probability-lab

Tensorflow Probability experiments
Python
1
star
26

rustlab

Rust playground project
Rust
1
star
27

avro-dotnet-lab

Example of using Apache Avro on .NET
F#
1
star
28

clojo-fractal-forest

Clojure version of the dojo code for the F# Dojo-Fractal-Forest
Clojure
1
star
29

cph-clj-clojureql

Copenhagen Clojure Meetup on ClojureQL
1
star
30

clojure-communityday-2012

Clojure presentation and code for the Community Day 2012 conference in Copenhagen, May 10th.
Clojure
1
star
31

clojurescript-generative-testing

Generative testing with clojure.spec in Clojurescript
Clojure
1
star
32

fscheck-playground

F# FsCheck property-based testing
F#
1
star
33

xmlparser

Experimental XML parser in Clojure using a parser combinator.
Clojure
1
star
34

tensorflow-lab

Experimenting with Tensorflow
Python
1
star
35

polars-rust-lab

Using Polars time-series and data-frames in Rust
Rust
1
star