  • Language
    Clojure
  • License
    GNU Affero Genera...
  • Created about 5 years ago
  • Updated 7 months ago

Repository Details

org-parser is a parser for the Org mode markup language for Emacs.

General

Clojars Project

Community chat: #organice on IRC Libera.Chat, or #organice:matrix.org on Matrix

What does this project do?

org-parser is a parser for the Org mode markup language for Emacs.

It can be used from JavaScript, Java, Clojure and ClojureScript!

Why is this project useful / Rationale

Org mode in Emacs is implemented in org-element.el (API documentation). The spec for the Org syntax is written in prose.

This is already great work, yet it has some drawbacks:

  1. The spec is not machine-readable, so documentation and implementation can drift apart. In fact, we have encountered such drift while developing org-parser and organice, our web-based Org implementation with great mobile phone support.
  2. org-element.el is naturally written in Emacs Lisp and relies heavily on Emacs as a text processor. Hence, its code can only be used within Emacs.

Writing the official spec is already an amazing effort toward standardizing the Org format, and the power of Org is so enticing that many want to use it outside of Emacs as well. Since org-element.el only runs in Emacs, a myriad of implementations have been created for other platforms (JavaScript, Rust, Go, Java, etc.). Most of them are only partial and, more importantly, each one creates another island. Since they are just as dependent on their programming language as org-element.el is, it is impossible to share logic between them.

org-parser aims to alleviate both of these issues. It documents the syntax in a standard, machine-readable notation (EBNF). The reference implementation runs on the established virtual machines of Java and JavaScript, so org-parser can be used from all programming languages running on those virtual machines. org-parser provides a higher-level data structure that is easy to consume for an application working with Org mode data. Even if your application is not running on the Java or JavaScript virtual machines, you can embed org-parser as a command-line application. Lastly, org-parser brings a strong test suite that documents the reference implementation in yet another unambiguous way.

It is our aim that org-parser can be the foundation on which many Org mode applications in many different languages can be built. The applications using org-parser can then focus on implementing user facing features and don’t have to worry about the implementation of the Org syntax itself.

Architecture

The code base of org-parser is split into four namespaces:

  • org-parser.core (top level api, i.e. read-str, write-str)
  • org-parser.parse (aka. deserializer, reader)
  • org-parser.parse.transform (transforms the result of the parser into a more desirable structure)
  • org-parser.render (aka. serializer, writer)

Thus, org-parser has become a misnomer in the sense that it now strives to be clojure/data.org (following the pattern of existing Clojure libraries like data.json, data.xml, data.csv, etc.), providing reader as well as writer capabilities for the Org serialization format.

Project State

This project is work-in-progress. It is not ready for production yet because the structure of the AST (parse tree) can still change.

The biggest milestones are:

  • [X] Finish EBNF parser to support most Org mode syntax
    • [X] Headlines
    • [X] Org mode #+* stuff
    • [X] Timestamps
    • [X] Links
    • [X] Text links
    • [X] Footnotes
    • [X] Styled text
    • [X] Drawers and #+BEGIN_xxx blocks
    • [ ] Nested markup (see #12)
  • [X] Setup basic transformation from the parse tree to a higher-level structure.
  • [-] Transformations to higher-level structure: catch up with features that are already supported by the EBNF parser.
  • [-] Render parsed org file with write-str

It can already be useful to you: if your script needs to parse some Org mode features, our EBNF parser probably already supports them. Do not underestimate timestamps, for example. Use our well-tested parser to break a timestamp into its parts instead of writing a fragile regex that only handles a subset of Org mode’s timestamps ;)

Don’t hesitate to contribute!

Development

org-parser uses instaparse, which aims to be the simplest way to build parsers in Clojure. Apart from living up to this claim (and beyond the scope of just the one programming language), instaparse is great for another reason: it works on both CLJ and CLJS. Therefore org-parser can be used from both ecosystems, which of course include Java and JavaScript.

Prerequisites

Please install Clojure and Leiningen.

There’s no additional installation required. Leiningen will pull dependencies if required.

Testing

Running the tests:

# Clojure
lein test
# CLJS (starts a watcher)
lein doo node

If you’re not familiar with Lisp or Clojure, here’s a short video on how the tooling for Lisp (and hence Clojure) is great and enables fast developer feedback and high quality applications. Initially, the video was created to answer a specific issue on this repository. However, the question is a valid general question that is asked quite often by people who haven’t used a Lisp before.

You can watch it here: https://youtu.be/o2MLHFGUkoQ

Release and Dependency Information

Note: The version number should be replaced with the current version of org-parser. See the clojars badge at the top of this README.

CLI/deps.edn dependency information:

org-parser/org-parser {:mvn/version "0.1.4"}

Leiningen dependency information:

[org-parser "0.1.4"]
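For orientation, here is a minimal sketch of where the CLI/deps.edn coordinate goes. Only the org-parser coordinate comes from this README; the surrounding map is the standard deps.edn layout, and the version should be replaced as noted above.

```clojure
;; deps.edn (sketch): the :deps map holds one entry per library.
;; "0.1.4" is the version quoted in this README and may be outdated.
{:deps {org-parser/org-parser {:mvn/version "0.1.4"}}}
```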

Usage

At the moment, you can run org-parser from Clojure, ClojureScript, or Java. Other targets which are hosted on the JVM or on JavaScript are possible.

Clojure Library

(ns hello-world.core
  (:require [org-parser.parser :refer [parse]]
            [org-parser.core :refer [read-str write-str]]))

;; Raw parse tree as produced by the EBNF parser
(prn (parse "* Headline"))
;; Higher-level structure after the transformation step
(prn (read-str "* Headline"))
;; Round trip: render the parsed data back to Org syntax
(println (write-str (read-str "* Headline")))

[:S [:headline [:stars "*"] [:text [:text-normal "Headline"]]]]
{:headlines [{:headline {:level 1, :title [[:text-normal "Headline"]], :planning [], :tags []}}]}
"* Headline\n"
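
To illustrate how an application might consume the higher-level structure, the following sketch flattens headlines into level and plain-text title pairs. It is not part of org-parser: the names `doc`, `title-text` and `outline` are hypothetical, and the sample data is a literal in the shape of the sample outputs in this README, so the snippet runs without org-parser on the classpath. It assumes every title part is a `[keyword string]` pair, which would not hold once nested markup is supported.

```clojure
(ns sketch.headlines
  (:require [clojure.string :as str]))

;; Literal sample in the shape of read-str output as shown in this README
;; (assumption: real output keeps this shape; the AST can still change).
(def doc
  {:headlines
   [{:headline {:level 1
                :title [[:text-sty-bold "Header"] [:text-normal " with repeater"]]
                :planning []
                :tags []}}]})

(defn title-text
  "Joins the string parts of a :title vector, dropping the style keywords."
  [title]
  (str/join (map second title)))

(defn outline
  "Returns one [level plain-title] pair per headline."
  [doc]
  (for [{{:keys [level title]} :headline} (:headlines doc)]
    [level (title-text title)]))

(outline doc)
;; => ([1 "Header with repeater"])
```

In a real program, `doc` would be the result of calling `read-str` on the contents of an Org file instead of a literal.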

Clojure

Run lein run file.org, for example:

lein run test/org_parser/fixtures/schedule_with_repeater.org
{:headlines [{:headline {:level 1, :title [[:text-sty-bold "Header"] [:text-normal " with repeater"]], :planning [[:planning-info [:planning-keyword [:planning-kw-scheduled]] [:timestamp-active [:ts-inner [:ts-inner-wo-time [:ts-date "2019-11-27"] [:ts-day "Wed"]] [:ts-modifiers [:ts-repeater [:ts-repeater-type "+"] [:ts-mod-value "1"] [:ts-mod-unit "d"]]]]]]], :tags []}}]}
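
The nested timestamp vectors in this output can be walked with plain Clojure. The sketch below copies the `:planning` value from the sample output above and pulls out the date, the day and the repeater. The helper names `find-node`, `leaf` and `scheduled-summary` are hypothetical and not part of org-parser, and the code assumes the tree shape stays as shown.

```clojure
(ns sketch.repeater)

;; The :planning value copied from the sample output above.
(def planning
  [[:planning-info
    [:planning-keyword [:planning-kw-scheduled]]
    [:timestamp-active
     [:ts-inner
      [:ts-inner-wo-time [:ts-date "2019-11-27"] [:ts-day "Wed"]]
      [:ts-modifiers
       [:ts-repeater
        [:ts-repeater-type "+"]
        [:ts-mod-value "1"]
        [:ts-mod-unit "d"]]]]]]])

(defn find-node
  "Returns the first vector in tree whose first element is tag, or nil."
  [tree tag]
  (->> (tree-seq vector? seq tree)
       (filter #(and (vector? %) (= tag (first %))))
       first))

(defn leaf
  "Returns the second element of the first node tagged tag, or nil."
  [tree tag]
  (second (find-node tree tag)))

(defn scheduled-summary
  "Collects date, day and repeater (for example \"+1d\") from a planning tree."
  [planning]
  {:date     (leaf planning :ts-date)
   :day      (leaf planning :ts-day)
   :repeater (when-let [r (find-node planning :ts-repeater)]
               (str (leaf r :ts-repeater-type)
                    (leaf r :ts-mod-value)
                    (leaf r :ts-mod-unit)))})

(scheduled-summary planning)
;; => {:date "2019-11-27", :day "Wed", :repeater "+1d"}
```

Searching by tag with `tree-seq` keeps the helper independent of how deeply a node is nested, which is convenient while the structure of the AST can still change.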

Java

First, compile org-parser with:

lein uberjar

Then run java -jar target/uberjar/org-parser-*-SNAPSHOT-standalone.jar file.org, for example:

java -jar target/uberjar/org-parser-*-SNAPSHOT-standalone.jar test/org_parser/fixtures/schedule_with_repeater.org
{:headlines [{:headline {:level 1, :title [[:text-sty-bold "Header"] [:text-normal " with repeater"]], :planning [[:planning-info [:planning-keyword [:planning-kw-scheduled]] [:timestamp-active [:ts-inner [:ts-inner-wo-time [:ts-date "2019-11-27"] [:ts-day "Wed"]] [:ts-modifiers [:ts-repeater [:ts-repeater-type "+"] [:ts-mod-value "1"] [:ts-mod-unit "d"]]]]]]], :tags []}}]}

Note: The * character must be replaced with the current version number of org-parser. See the clojars badge at the top of this README.

License
