• This repository has been archived on 05/Sep/2024
  • Language
    Go
  • License
    MIT License

libxml2

Interface to libxml2, with DOM interface.

Why?

I needed to write go-xmlsec. This means we need to build trees using libxml2, and then muck with them in xmlsec. Having two separate packages in Go means we cannot (safely) pass around C.xmlFooPtr objects (also, you pay a penalty for pointer types). This package carefully avoids references to C.xmlFooPtr types and uses uintptr to pass data around, so other libraries that need to interact with libxml2 can do so safely.
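
The idea can be illustrated without cgo. The sketch below is only an illustration under stated assumptions: `handleSource`, `fakeNode`, and `describe` are hypothetical names invented for this example and are not this package's API. One side exposes its underlying handle as a plain `uintptr`, and the other side accepts anything that can produce one, so no `C.xmlFooPtr` type has to appear in a shared signature.

```go
package main

import "fmt"

// handleSource is a hypothetical interface for this sketch: anything that
// can hand out its underlying C pointer as an opaque integer.
type handleSource interface {
	Pointer() uintptr
}

// fakeNode stands in for a wrapper around a C struct. A real binding would
// obtain ptr from cgo; here it is just a number so the example runs anywhere.
type fakeNode struct {
	ptr uintptr
}

// Pointer returns the opaque handle held by the node.
func (n fakeNode) Pointer() uintptr {
	return n.ptr
}

// describe plays the role of a second package. It only sees a uintptr, so it
// does not need to share any cgo type with the package that created the node.
func describe(src handleSource) string {
	p := src.Pointer()
	if p == 0 {
		return "nil handle"
	}
	return fmt.Sprintf("handle 0x%x", p)
}

func main() {
	fmt.Println(describe(fakeNode{ptr: 0x1000}))
	fmt.Println(describe(fakeNode{}))
}
```

In a real binding, the receiving package would convert the `uintptr` back to its own C pointer type at its own cgo boundary.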

Status

  • This library should be considered alpha grade. API may still change.
  • Much of the commonly used functionality from libxml2 that I use is already there, and is known to be functional.

Package Layout:

Name      Description
libxml2   Globally available utility functions, such as ParseString
types     Common data types, such as types.Node
parser    Parser routines
dom       DOM-like manipulation of XML document/nodes
xpath     XPath related tools
xsd       XML Schema related tools
clib      Wrapper around C libxml2 library - DO NOT TOUCH IF UNSURE

Features

Create XML documents using DOM-like interface:

  d := dom.CreateDocument()
  e, err := d.CreateElement("foo")
  if err != nil {
    println(err)
    return
  }
  d.SetDocumentElement(e)
  ...

Parse documents:

  d, err := libxml2.ParseString(xmlstring)
  if err != nil {
    println(err)
    return
  }

Use XPath to extract node values:

  text := xpath.String(node.Find("//xpath/expression"))

Examples

Basic XML Example

import (
  "log"
  "net/http"

  "github.com/lestrrat-go/libxml2"
  "github.com/lestrrat-go/libxml2/parser"
  "github.com/lestrrat-go/libxml2/types"
  "github.com/lestrrat-go/libxml2/xpath"
)

func ExampleXML() {
  res, err := http.Get("http://blog.golang.org/feed.atom")
  if err != nil {
    panic("failed to get blog.golang.org: " + err.Error())
  }

  // Close the body once we are done, even if parsing fails.
  defer res.Body.Close()

  p := parser.New()
  doc, err := p.ParseReader(res.Body)
  if err != nil {
    panic("failed to parse XML: " + err.Error())
  }
  defer doc.Free()

  doc.Walk(func(n types.Node) error {
    log.Println(n.NodeName())
    return nil
  })

  root, err := doc.DocumentElement()
  if err != nil {
    log.Printf("Failed to fetch document element: %s", err)
    return
  }

  ctx, err := xpath.NewContext(root)
  if err != nil {
    log.Printf("Failed to create xpath context: %s", err)
    return
  }
  defer ctx.Free()

  if err := ctx.RegisterNS("atom", "http://www.w3.org/2005/Atom"); err != nil {
    log.Printf("Failed to register namespace: %s", err)
    return
  }
  title := xpath.String(ctx.Find("/atom:feed/atom:title/text()"))
  log.Printf("feed title = %s", title)
}

Basic HTML Example

func ExampleHTML() {
  res, err := http.Get("http://golang.org")
  if err != nil {
    panic("failed to get golang.org: " + err.Error())
  }

  defer res.Body.Close()

  doc, err := libxml2.ParseHTMLReader(res.Body)
  if err != nil {
    panic("failed to parse HTML: " + err.Error())
  }
  defer doc.Free()

  doc.Walk(func(n types.Node) error {
    log.Println(n.NodeName())
    return nil
  })

  nodes := xpath.NodeList(doc.Find(`//div[@id="menu"]/a`))
  for i := 0; i < len(nodes); i++ {
    log.Printf("Found node: %s", nodes[i].NodeName())
  }
}

XSD Validation

import (
  "io"
  "log"
  "os"
  "path/filepath"

  "github.com/lestrrat-go/libxml2"
  "github.com/lestrrat-go/libxml2/xsd"
)

func ExampleXSD() {
  xsdfile := filepath.Join("test", "xmldsig-core-schema.xsd")
  f, err := os.Open(xsdfile)
  if err != nil {
    log.Printf("failed to open file: %s", err)
    return
  }
  defer f.Close()

  // io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16+).
  buf, err := io.ReadAll(f)
  if err != nil {
    log.Printf("failed to read file: %s", err)
    return
  }

  s, err := xsd.Parse(buf)
  if err != nil {
    log.Printf("failed to parse XSD: %s", err)
    return
  }
  defer s.Free()

  d, err := libxml2.ParseString(`<foo></foo>`)
  if err != nil {
    log.Printf("failed to parse XML: %s", err)
    return
  }
  defer d.Free()

  if err := s.Validate(d); err != nil {
    for _, e := range err.(xsd.SchemaValidationError).Errors() {
      log.Printf("error: %s", e.Error())
    }
    return
  }

  log.Printf("validation successful!")
}

Caveats

Other libraries

There exist many similar libraries. I want speed, I want DOM, and I want XPath. When a library meets all of these, I'd be happy to switch to it.

For now my closest contender was xmlpath, but as of this writing it is a bit slower in the XPath area:

shoebill% go test -v -run=none -benchmem -benchtime=5s -bench .
PASS
BenchmarkXmlpathXmlpath-4     500000         11737 ns/op         721 B/op          6 allocs/op
BenchmarkLibxml2Xmlpath-4    1000000          7627 ns/op         368 B/op         15 allocs/op
BenchmarkEncodingXMLDOM-4    2000000          4079 ns/op        4560 B/op          9 allocs/op
BenchmarkLibxml2DOM-4        1000000         11454 ns/op         264 B/op          7 allocs/op
ok      github.com/lestrrat-go/libxml2  37.597s

FAQ

"It won't build"

The very first thing you need to be aware of is that this is a C binding to libxml2. You should understand how to build C programs, how to debug them, or at least be able to ask the right questions and deal with a great deal more than Go alone.

Having said that, the most common causes for build errors are:

  1. You have not installed libxml2 / You installed it incorrectly

The first one is obvious, but I get this a lot. You have to install libxml2. If you are installing via some sort of package manager like apt/apk, remember that you need to install the "development" files as well. The name of the package differs in each environment, but it's usually something like "libxml2-dev".

The second is more subtle, and tends to happen when you install your libxml2 in a non-standard location. This causes problems for other tools such as your C compiler or pkg-config. See more below.

  2. Your header files are not in the search path

If you don't understand what header files are or how they work, this is where you should either look for your local C-guru, or study how these things work before filing an issue on this repository.

Your C compiler, which is invoked via Go, needs to be able to find the libxml2 header files. If you installed them in a non-standard location, that is, somewhere outside of /usr/include and /usr/local/include, you may have to configure the search path yourself.

How to configure them depends greatly on your environment, and again, if you don't understand how you can fix it, you should consult your local C-guru about it, not this repository.

  3. Your pkg-config files are not in the search path

If you don't understand what pkg-config does, this is where you should either look for your local sysadmin friend, or study how these things work before filing an issue on this repository.

pkg-config provides metadata about installed components, such as the build flags they require. Go uses it to figure out how to build and link Go programs that need to interact with things written in C.

However, pkg-config is merely a thin frontend to extract information from file(s) that each component provided upon installation. pkg-config itself needs to know where to find these files.

Make sure that the output of the following command contains libxml-2.0. If not, and you don't understand how to fix this yourself, you should consult your local sysadmin friend about it, not this repository.

pkg-config --list-all
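
The same check can be scripted. The following is a minimal sketch, assuming `pkg-config` is on your PATH and that each line of its listing starts with the package name; `hasPackage` is a hypothetical helper written for this example.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// hasPackage reports whether name appears as the first field of any line in
// the output of `pkg-config --list-all`.
func hasPackage(listing, name string) bool {
	for _, line := range strings.Split(listing, "\n") {
		fields := strings.Fields(line)
		if len(fields) > 0 && fields[0] == name {
			return true
		}
	}
	return false
}

func main() {
	out, err := exec.Command("pkg-config", "--list-all").Output()
	if err != nil {
		fmt.Println("could not run pkg-config:", err)
		return
	}
	if hasPackage(string(out), "libxml-2.0") {
		fmt.Println("pkg-config can see libxml-2.0")
	} else {
		fmt.Println("libxml-2.0 is missing; check PKG_CONFIG_PATH")
	}
}
```

If the package is missing even though libxml2 is installed, the directory containing its `.pc` file is probably not listed in PKG_CONFIG_PATH.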

"Fatal error: 'libxml/HTMLparser.h' file not found"

See the first FAQ entry.

I can't build this library statically

See prior discussion: #62
