• Stars
    star
    173
  • Rank 220,124 (Top 5 %)
  • Language
    Perl
  • Created over 14 years ago
  • Updated about 1 year ago

Repository Details

Catmandu - a data processing toolkit

NAME

Catmandu::Introduction - A 5 minute introduction to Catmandu

HELLO WORLD

$ catmandu convert Null --fix 'add_field(hello,world)'
[{"hello":"world"}]    

The example above generates the JSON output [{"hello":"world"}] on the standard output. We asked the Catmandu processor to convert an empty input (Null) and add one property hello with value world.

Instead of the default JSON output, we can ask Catmandu to produce YAML:

$ catmandu convert Null --fix 'add_field(hello,world)' to YAML
---
hello: world
...  

FORMAT to FORMAT

Catmandu can be used to convert an input format to an output format. Use the keyword to on the command line:

$ cat file.yaml
---
hello: world
... 
$ catmandu convert YAML to JSON < file.yaml
[{"hello":"world"}]  

The part to the left of the to keyword is called the Importer; the part to the right is called the Exporter. Catmandu provides Importers and Exporters for many formats.
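
The Importer/Exporter split can be pictured as a reader that yields records and a writer that consumes them. The Python sketch below is only an illustration of that idea, not Catmandu's implementation (Catmandu is written in Perl). The names import_json_lines, export_json_array and convert are hypothetical, and JSON lines are used as the input format only because Python's standard library can parse them.

```python
import io
import json


def import_json_lines(stream):
    """Hypothetical importer: yield one dict per non-empty line of JSON."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def export_json_array(records, stream):
    """Hypothetical exporter: write all records as one compact JSON array."""
    json.dump(list(records), stream, separators=(",", ":"))
    stream.write("\n")


def convert(source, sink, fixes=()):
    """Read records, apply each fix function in order, and write them out."""
    def fixed_records():
        for record in import_json_lines(source):
            for fix in fixes:
                record = fix(record)
            yield record

    export_json_array(fixed_records(), sink)


# One input record in, one JSON array out.
source = io.StringIO('{"hello":"world"}\n')
sink = io.StringIO()
convert(source, sink)
output = sink.getvalue()
print(output, end="")
```

Because the importer is a generator, records flow through the fixes one at a time; only this toy exporter collects them into a list before writing.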

OPTIONS

Each Importer and Exporter can have options that change the behavior of the conversion. The options are documented for each Importer and Exporter and can be read with the perldoc command:

perldoc Catmandu::Importer::YAML
perldoc Catmandu::Exporter::JSON

Note that many formats are available both as an Importer and as an Exporter.

As an example, we can use the JSON Exporter option pretty to produce a pretty-printed version of the JSON:

$ catmandu convert YAML to JSON --pretty 1 < file.yaml
[{ 
    "hello" : "world"
}]

FIX LANGUAGE

Many data conversions need a mapping from one field to another field plus optional conversions of the data inside these fields. Catmandu provides the Fix language to assist in these mappings. A full list of Fix functions is available at https://librecat.org/assets/catmandu_cheat_sheet.pdf.

Fixes can be provided inline as the text argument of the --fix command line option, or as a path to a Fix Script. A Fix Script groups one or more fixes in a file.

$ cat example.fix
add_field('address.street','Walker Street')
add_field('address.number','15')
copy_field('colors.2','best_color')

$ cat data.yaml
---
colors:
- Red
- Green
- Blue
...

$ catmandu convert YAML --fix example.fix to YAML < data.yaml
---
address:
    number: '15'
    street: Walker Street
best_color: Blue
colors:
    - Red
    - Green
    - Blue
...

In the example we created the Fix Script example.fix, which combines mappings and data conversions on (nested) data. We then ran a YAML to YAML conversion using the example.fix Fix Script. Note that array indexes in paths start at 0, so colors.2 selects the third color, Blue.
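
To make the path handling in example.fix concrete, here is a minimal Python sketch of what the two fixes do to a record. It is an approximation for illustration only, not Catmandu's implementation: the helper names split_path and get_path are hypothetical, and the sketch only creates nested dictionaries on write and only follows list indexes on read.

```python
import copy


def split_path(path):
    """Split a dotted path such as 'colors.2' into its parts."""
    return path.split(".")


def add_field(record, path, value):
    """Set value at a dotted path, creating nested dicts as needed."""
    *parents, last = split_path(path)
    node = record
    for key in parents:
        node = node.setdefault(key, {})
    node[last] = value
    return record


def get_path(record, path):
    """Follow a dotted path; numeric parts index into lists (0-based)."""
    node = record
    for key in split_path(path):
        node = node[int(key)] if isinstance(node, list) else node[key]
    return node


def copy_field(record, source, target):
    """Copy the value found at source to the target path."""
    return add_field(record, target, copy.deepcopy(get_path(record, source)))


# The same data and the same three fixes as in example.fix.
record = {"colors": ["Red", "Green", "Blue"]}
add_field(record, "address.street", "Walker Street")
add_field(record, "address.number", "15")
copy_field(record, "colors.2", "best_color")
print(record)
```

The resulting record holds the original colors list, a nested address with street and number, and best_color set to Blue, matching the YAML output shown above.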

SPECIALIZATIONS

Catmandu was mainly created for data conversions of specialized metadata languages in the field of libraries, archives and museums. One of the packages with specialized Importers (and Exporters) is Catmandu::MARC. This package can read, write and convert MARC files.

For instance, to extract all the titles from an ISO MARC file one could write:

$ cat titles.fix
marc_map('245',title)
retain(title)

$ catmandu convert MARC --type ISO --fix titles.fix to CSV < data.mrc

marc_map is a specialized Fix function for MARC data. In the example above, the 245 field of each MARC record is mapped to the title field. The retain Fix function keeps only the title field in the output.
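
The effect of titles.fix can be sketched in Python as follows. This is a simplified model, not the Catmandu::MARC code. It assumes each MARC field is stored as a list of the form [tag, indicator 1, indicator 2, subfield code, value, subfield code, value, ...] under a record key, and it assumes subfield values are simply concatenated. The sample record is invented for illustration, and the real marc_map supports subfield selection and other options that are not modelled here.

```python
def marc_map(record, tag, target):
    """Copy the subfield values of every field with this tag into target."""
    values = [
        "".join(field[4::2])  # values sit at positions 4, 6, 8, ...
        for field in record.get("record", [])
        if field[0] == tag
    ]
    if values:
        record[target] = "".join(values)
    return record


def retain(record, *keep):
    """Delete every top-level key that is not listed in keep."""
    for key in list(record):
        if key not in keep:
            del record[key]
    return record


# Invented sample record using the assumed field layout.
marc_record = {
    "_id": "0001",
    "record": [
        ["100", "1", " ", "a", "Author, A."],
        ["245", "1", "0", "a", "An example title /", "c", "by A. Author."],
    ],
}

marc_map(marc_record, "245", "title")
retain(marc_record, "title")
print(marc_record)
```

After both steps only the title key is left, which is why the CSV output of the command above contains a single title column.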

TUTORIAL

An 18-day tutorial on Catmandu and the Fix language is available at https://librecatproject.wordpress.com/tutorial/.

More information is also available in our wiki: https://github.com/LibreCat/Catmandu/wiki

More Repositories

1

LibreCat

A publication management system
Perl
43
star
2

Catmandu-MARC

Catmandu modules for working with MARC data
Perl
8
star
3

Catmandu-Store-Elasticsearch

Perl
6
star
4

docker-catmandu

Docker image for the Catmandu data toolkit
Perl
5
star
5

Catmandu-SRU

Catmandu module for working with SRU data.
Perl
5
star
6

Catmandu-RDF

Catmandu modules for working with RDF data
Perl
5
star
7

Catmandu-Examples

Examples of Catmandu programs and Fix scripts
Perl
4
star
8

Catmandu-Projects

Perl
4
star
9

Catmandu-XML

Catmandu modules for working with XML data
Perl
4
star
10

catmandu-the-book

Experiment in creating a book from POD and GitHub wiki
Perl
4
star
11

Catmandu-Store-MongoDB

A searchable store backed by MongoDB
Perl
4
star
12

Catmandu-OAI

Catmandu modules for working with OAI repositories
Perl
3
star
13

Catmandu-Zotero

Catmandu support for Zotero WEB
Perl
3
star
14

MARC2RDF

Example Fix scripts for transforming MARC to RDF triples
mIRC Script
3
star
15

Catmandu-RIS

Catmandu modules for working with RIS data
Perl
3
star
16

Dancer-Plugin-Catmandu-OAI

OAI-PMH provider backed by a searchable Catmandu::Store
Perl
2
star
17

embedgenerator

Perl
2
star
18

Catmandu-Identifier

Namespace for handling/fixing identifier, e.g. ISBN, ISSN
Perl
2
star
19

Catmandu-XSD

Catmandu tools to process XML Schema (XSD) based input
Perl
2
star
20

Catmandu-BibTeX

Catmandu modules for working with bibtex data
Perl
2
star
21

Catmandu-PNX

Catmandu tools to process Primo normalized XML (PNX) records
Perl 6
2
star
22

librecat.github.io

New LibreCat website
HTML
2
star
23

imaging

Perl
2
star
24

csl

Perl
2
star
25

MARC-Conversion-Examples

Scripts and examples how to convert MARC records
2
star
26

Catmandu-Blacklight

Modules to retrieve/store data in a Blacklight catalog
Perl
1
star
27

repo_tweet

Perl
1
star
28

librecat-hfh-demo

demo for dancer and HTML::FormHandler
Perl
1
star
29

Plack-Middleware-Memento-Handler-Catmandu-Bag

Perl
1
star
30

csl-server

JavaScript
1
star
31

Catmandu-Z3950

Catmandu module for working with Z3950 data.
Perl
1
star
32

Catmandu-Importer-OpenAIRE

Package that queries the OpenAIRE Graph
Perl
1
star
33

citeproc-ringo

citeproc server powered by ringo.js
JavaScript
1
star
34

Blog

LibreCat-Blog is an example Dancer + ElasticSearch + Catmandu project
JavaScript
1
star
35

Catmandu-Store-Lucy

Perl
1
star
36

Catmandu-Atom

Catmandu-Atom - modules for working with Atom feeds
Perl
1
star
37

Catmandu-FedoraCommons

Perl Fedora Commons REST API tools
Perl
1
star
38

Catmandu-Stat

Catmandu support for basic statistical data analysis
Perl
1
star
39

Catmandu-Cmd-repl

Perl
1
star
40

Catmandu-BagIt

Catmandu module for working with BagIt packages
Perl
1
star
41

Catmandu-LDAP

Perl
1
star
42

Catmandu-LIDO

Catmandu tools for working with LIDO data
Perl
1
star
43

Catmandu-Cmd-fuse

Perl
1
star
44

Catmandu-MODS

Catmandu Importer for importing MODS records, with use of the CPAN module MODS::Record
Perl
1
star
45

cerif

Example scripts to export CERIF data
Perl
1
star
46

Plack-Middleware-Memento

Perl
1
star
47

Catmandu-Twitter

Catmandu importer for Twitter feeds
Perl
1
star
48

Catmandu-HTML

Catmandu modules to process HTML files
HTML
1
star
49

Catmandu-Inspire

Catmandu modules for working with Inspire data
Perl
1
star
50

Catmandu-ORCID

Catmandu Importer for publications registered at the ORCID registry
Perl
1
star
51

Dancer-Plugin-Catmandu-SRU

SRU server backed by a searchable Catmandu::Store
Perl
1
star
52

Catmandu-Breaker

Package that exports data in a Breaker format
Perl
1
star
53

catmandu-notebook

Dockerfile for running a docker container with Catmandu in Jupyter notebook
Jupyter Notebook
1
star