Simplified XPath Library for Clojure

https://clojars.org/com.github.kyleburton/clj-xpath

Overview

clj-xpath is a library that makes it easier to work with XPath from Clojure.

Documentation

Documentation is available on GH Pages for clj-xpath

Description

Simplified XPath Library for Clojure. XML parsers and an XPath implementation have shipped with the JDK since Java 6, but using the API directly can be verbose and confusing. This library provides a thin layer around basic parsing and XPath interaction for common use cases. I have personally found the ability to interactively tweak my XPath expressions to be a great productivity boost; even using this library only for that has helped me learn and use XPath. I hope you find it useful and would love to hear your feedback and suggestions.
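
For comparison, here is a minimal sketch of a single attribute lookup using only the JDK classes through Clojure interop, with no clj-xpath involved. The helper names parse-xml and xpath-text are hypothetical and exist only for this illustration; they are not part of the library.

```clojure
(ns example.raw-xpath
  (:import [java.io ByteArrayInputStream]
           [javax.xml.parsers DocumentBuilderFactory]
           [javax.xml.xpath XPath XPathFactory]))

(defn parse-xml
  "Hypothetical helper: parse an XML string into an org.w3c.dom.Document
  with the JDK's built-in parser."
  [^String xml]
  (let [builder (.newDocumentBuilder (DocumentBuilderFactory/newInstance))]
    (.parse builder (ByteArrayInputStream. (.getBytes xml "UTF-8")))))

(defn xpath-text
  "Hypothetical helper: evaluate an XPath expression against a parsed
  document and return the string value of the first match."
  [^String expr doc]
  (let [^XPath xpath (.newXPath (XPathFactory/newInstance))]
    (.evaluate xpath expr doc)))

(xpath-text "/books/book/@title"
            (parse-xml "<books><book title=\"Some Guide To XML\"/></books>"))
;; => "Some Guide To XML"
```

Every step (factory, builder, byte stream, XPath instance) has to be wired up by hand, which is the ceremony the library is meant to hide.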

Usage

The main functions in the library are $x and those named with a prefix of $x: (e.g. $x:text). The name $x was chosen because it matches the Firebug XPath function and is short and uncommon. These functions all take the XPath expression to execute and an XML document, and they try to be flexible about the form the XML document takes: a string is treated as XML, a byte array is used directly, and a Document or Node (from org.w3c.dom) is used as-is.

There are four forms of most of the core functions, each with a different suffix borrowed from regular expression syntax: none, *, + and ?. For example, $x:tag has the following four implementations:

  • ($x:tag "//books"): '1 and only 1', returns the single node found, throwing an exception if none or more than 1 are found.
  • ($x:tag? "//books"): '0 or 1', returns the single node found or nil, throwing an exception if more than 1 are found.
  • ($x:tag* "//book"): '0 or more', returns a sequence of the nodes found (which may be empty)
  • ($x:tag+ "//book"): '1 or more', returns a sequence of the nodes found, throwing an exception if none are found.
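
The four forms can be compared side by side in a short sketch. It assumes clj-xpath is on the classpath and uses a made-up two-book document; the exception is caught as a plain Exception because the concrete exception class is not stated here.

```clojure
(ns example.suffixes
  (:require [clj-xpath.core :refer [$x:tag $x:tag? $x:tag* $x:tag+]]))

;; Illustrative document, not taken from the library's fixtures.
(def xml "<books><book title=\"A\"/><book title=\"B\"/></books>")

;; No suffix: exactly one match is required.
($x:tag "/books" xml)                 ;; => :books

;; ?: zero or one match, nil when nothing matches.
($x:tag? "//magazine" xml)            ;; => nil

;; *: zero or more matches, an empty result is not an error.
(count ($x:tag* "//book" xml))        ;; => 2
(empty? ($x:tag* "//magazine" xml))   ;; => true

;; +: one or more matches.
(count ($x:tag+ "//book" xml))        ;; => 2

;; The unsuffixed form throws when more than one node matches.
(try
  ($x:tag "//book" xml)
  :no-error
  (catch Exception _ :threw))         ;; => :threw
```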

If you are interested in the entire node found by the XPath expression and not just in particular aspects of the node (tag, attributes, text content), the $x function returns a map containing the XML tag (as a keyword), the DOM Node, the text (as a string), and a map of the attributes where the keys have been converted into keywords and the values remain Strings.

(ns example
  (:use [clj-xpath.core :only [$x $x:tag $x:text $x:attrs $x:attrs* $x:node]]))

(def *some-xml*
     "<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<books>
  <book title=\"Some Guide To XML\">
    <author>
      <name>P.T. Xarnum</name>
      <email>[email protected]</email>
    </author>
    <description>
      Simply the most comprehensive XML Book on the market today.
    </description>
  </book>
  <book title=\"Some Guide To Functional Programming\">
    <author>
      <name>S. Hawking</name>
      <email>[email protected]</email>
    </author>
    <description>
      This book is too smart for you, try 'Head first Quantum Mechanics for Dummies' instead.
    </description>
  </book>
</books>")


;; get the top level tag:
(prn ($x:tag "/*" *some-xml*))
;; :books

;; find all :book nodes, pull the title from the attributes:
(prn (map #(-> % :attrs :title) ($x "//book" *some-xml*)))
;; ("Some Guide To XML" "Some Guide To Functional Programming")

;; same result using the $x:attrs* function:
(prn ($x:attrs* "//book" *some-xml* :title))
;; ("Some Guide To XML" "Some Guide To Functional Programming")

;; first select the :book element whose title has 'XML' in it
;; from that node, get and print the author's name (text content):
(prn ($x:text "./author/name"
              ($x:node "//book[contains(@title,'XML')]" *some-xml*)))
;; "P.T. Xarnum"
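
The map returned for each node by $x can also be inspected directly. This is a minimal sketch assuming clj-xpath is on the classpath; the :attrs key appears in the example above, while the key names :tag, :text and :node are assumptions based on the description of the map and should be checked against the library documentation.

```clojure
(ns example.node-map
  (:require [clj-xpath.core :refer [$x]]))

;; Illustrative document, not taken from the library's fixtures.
(def xml "<book title=\"T\"><name>N</name></book>")

;; $x returns a sequence of maps, one per matching node.
(def book (first ($x "/book" xml)))

(-> book :attrs :title)   ;; attribute value as a String
(:tag book)               ;; assumed key: the tag as a keyword
(:text book)              ;; assumed key: the node's text content
(instance? org.w3c.dom.Node (:node book)) ;; assumed key: the DOM node
```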

Parsing and XPath Compilation

The $x and related functions accept Strings and, in many cases, other convenient types for their arguments. Wherever an XML document is expected, you can pass a String, a byte array or a Document. Wherever an XPath expression is expected, you can pass either a String or a pre-compiled XPathExpression. Parsing an XML document and compiling an XPath expression are both expensive operations. With this flexibility, clj-xpath supports the convenience of in-line usage (with String data) as well as pre-parsed and pre-compiled instances for better performance.

  (let [expr (xp:compile "/*")
        doc  (xml->doc "<authors><author><name>P.T. Xarnum</name></author></authors>")]
    ($x:tag expr doc))

(xml->doc doc) => Document

This function takes xml that is of one of the following types and returns a Document: String, byte array or org.w3c.dom.Document. In cases of repeated usage of the document (eg: executing multiple xpath expressions against the same document) this will improve performance.

(xp:compile xpexpr) => javax.xml.xpath.XPathExpression

Pre-compiles the xpath expression. In cases of repeated execution of the xpath expression this will improve performance.
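
As a sketch of that reuse, the snippet below compiles one expression and parses each document once, then applies the compiled expression to every document. It assumes clj-xpath is on the classpath; the document strings and the var names title-expr and docs are made up for the illustration.

```clojure
(ns example.precompiled
  (:require [clj-xpath.core :refer [$x:text xml->doc xp:compile]]))

;; Compile the expression once instead of on every call.
(def title-expr (xp:compile "/book/title"))

;; Parse each document once; mapv forces the parsing eagerly.
(def docs
  (mapv xml->doc
        ["<book><title>A</title></book>"
         "<book><title>B</title></book>"]))

;; Reuse the compiled expression against each parsed document.
(mapv #($x:text title-expr %) docs)
;; => ["A" "B"]
```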

Validation

Validation is now off by default. It is controlled by optional parameters passed to xml-bytes->dom, or by rebinding the dynamic var *validation* (the example below binds it to false explicitly):

  (ns your.namespace
    (:use clj-xpath.core))

  (binding [*validation* false]
    ($x:text "/this" "<this>foo</this>"))

XPath and XML Namespaces

To use the xpath library with an XML document that utilizes XML namespaces, you can make use of the with-namespace-context macro providing a map of aliases to the xmlns URL:

  (def xml-doc (slurp "fixtures/namespace1.xml"))
  (with-namespace-context {"atom" "http://www.w3.org/2005/Atom"}
    ($x:text "//atom:title" xml-doc))
  ;; => BookingCollection

There is also a utility function that can pull the namespace declarations from the root node of your XML document:

  (def xml-doc (slurp "fixtures/namespace1.xml"))
  (with-namespace-context (xmlnsmap-from-root-node xml-doc)
    ($x:text "//atom:title" xml-doc))
  ;; => BookingCollection

These two examples assume the following XML document:

<atom:feed xml:base="http://nplhost:8042/sap/opu/sdata/IWFND/RMTSAMPLEFLIGHT/"
                  xmlns:atom="http://www.w3.org/2005/Atom"
                  xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
                  xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
                  xmlns:sap="http://www.sap.com/Protocols/SAPData">

<atom:title>BookingCollection</atom:title>
<atom:updated>2012-03-19T20:27:30Z</atom:updated>

<atom:entry>
<atom:author/>
</atom:entry>

<atom:entry>
<atom:author/>
<atom:content type="application/xml"/>
</atom:entry>

</atom:feed>

Changes

Version 1.4.13 : Sat Feb 25 12:37:09 2023 -0800
  • fixes #36 | Remove Apache Xalan dependency, no longer necessary for recent Java versions (from 7 onward)
  • drop support for clojure 1.6 and below
  • fix all flycheck warnings and errors
  • add Bakefile for local dev
Version 1.4.12 : Sun Dec 12 14:08:04 2021 -0800
  • fix log4j vulnerability
Version 1.4.11 : Sun Jan 1 09:56:21 PST 2017
Version 1.4.3 : Sat Sep 14 10:11:56 EDT 2013
  • Compatibility with Clojure 1.2, 1.3, 1.4, 1.5 and 1.6-SNAPSHOT
Version 1.4.1 : Sat Sep 7 21:10:16 EDT 2013
  • Support leiningen 2
  • create profiles for clojure 1.2 through 1.6
  • resolve reflection warnings: NB: two remain for clojure 1.3
Version 1.4.1 : Sat Feb 16 12:15:26 EST 2013

Changed project group from org.clojars.kyleburton to com.github.kyleburton.

Version 1.4.0 : Tue Dec 18 15:10:19 EST 2012
  • :children lazy seq of a Node's children added by mtnygard
  • idiomatic use of next

Hacking

# to run an nrepl that you can connect to with `M-x cider-connect-clj`:

# if you have bake installed
$ bake nrepl

# with leiningen
$ lein with-profile dev run -m clj-xpath.nrepl

Deploying

Create ~/.lein/credentials.clj

{#"https://repo.clojars.org"
 {:username "<<clojars-user-name>>" :password "CLOJARS_<<deploy-token>>"}}

Your clojars deploy tokens are managed at https://clojars.org/tokens

Encrypt it:

$ gpg --default-recipient-self -e ~/.lein/credentials.clj > ~/.lein/credentials.clj.gpg

Verify

$ gpg --decrypt ~/.lein/credentials.clj.gpg

Deploy

$ lein deploy clojars
$ lein release
