• Stars
    star
    291
  • Rank 137,437 (Top 3 %)
  • Language
    Haskell
  • License
    GNU General Publi...
  • Created almost 15 years ago
  • Updated about 1 month ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

A Haskell library for converting LaTeX math to MathML.

texmath

CI tests

texmath is a Haskell library for converting between formats used to represent mathematics. Currently it provides functions to read and write TeX math, presentation MathML, and OMML (Office Math Markup Language, used in Microsoft Office), and to write Gnu eqn, typst, and pandoc's native format (allowing conversion, using pandoc, to a variety of different markup formats). The TeX reader and writer supports basic LaTeX and AMS extensions, and it can parse and apply LaTeX macros. The package also includes several utility modules which may be useful for anyone looking to manipulate either TeX math or MathML. For example, a copy of the MathML operator dictionary is included.

You can try it out online here.

By default, only the Haskell library is installed. To install a test program, texmath, use the executable Cabal flag:

cabal install -fexecutable

By default, the executable will be installed in ~/.cabal/bin.

Alternatively, texmath can be installed using stack. Install the stack binary somewhere in your path. Then, in the texmath repository,

stack setup
stack install --flag texmath:executable

The texmath binary will be put in ~/.local/bin.

Macro definitions may be included before a LaTeX formula.

Running texmath as a server

texmath will behave as a CGI script when called under the name texmath-cgi (e.g. through a symbolic link). The file cgi/texmath.html contains an example of how it can be used.

But it is also possible to compile a full webserver with a JSON API. To do this, set the server cabal flag, e.g.

stack install --flag texmath:server

To run the server on port 3000:

texmath-server -p 3000

Sample of use, with httpie:

% http --verbose localhost:3000/convert text='2^2' from=tex to=mathml display:=false Accept:'text/plain'
POST /convert HTTP/1.1
Accept: text/plain
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 64
Content-Type: application/json
Host: localhost:3000
User-Agent: HTTPie/3.1.0

{
    "display": false,
    "from": "tex",
    "text": "2^2",
    "to": "mathml"
}


HTTP/1.1 200 OK
Content-Type: text/plain;charset=utf-8
Date: Mon, 21 Mar 2022 18:29:26 GMT
Server: Warp/3.3.17
Transfer-Encoding: chunked

<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML">
  <msup>
    <mn>2</mn>
    <mn>2</mn>
  </msup>
</math>

Possible values for from are tex, mathml, and omml. Possible values for to are tex, mathml, omml, eqn, and pandoc (JSON-encoded Pandoc).

Alternatively, you can use the convert-batch endpoint to pass in a JSON-encoded list of conversions and get back a JSON-encoded list of results.

Generating lookup tables

There are three main lookup tables which are built form externally compiled lists. This section contains information about how to modify and regenerate these tables.

In the lib direction there are two sub-directories which contain the necessary files.

MMLDict.hs

The utility program xsltproc is required. You can find these files in lib/mmldict/

  1. If desired replace unicode.xml with and updated version (you can download a copy from here
  2. xsltproc -o dictionary.xml operatorDictionary.xsl unicode.xml
  3. runghc generateMMLDict.hs
  4. Replace the operator table at the bottom of src/Text/TeXMath/Readers/MathML/MMLDict.hs with the contents of mmldict.hs

ToTeXMath.hs

You can find these files in lib/totexmath/

  1. If desired, replace unimathsymbols.txt with an updated version from here
  2. runghc unicodetotex.hs
  3. Replace the record table at the bottom of src/Text/TeXMath/Unicode/ToTeXMath.hs with the contents of UnicodeToLaTeX.hs

ToUnicode.hs

You can find these files in lib/tounicode/.

  1. If desired, replace UnicodeData.txt with an updated verson from here.
  2. runghc mkUnicodeTable.hs
  3. Replace the table at the bottom of src/Text/TeXMath/Unicode/ToUnicode.hs with the output.

Editing the tables

It is not necessary to edit the source files to add records to the tables. To add to or modify a table it is easier to add modify either unicodetotex.hs or generateMMLDict.hs. This is easily achieved by adding an item to the corresponding updates lists. After making the changes, follow the above steps to regenerate the table.

The test suite

To run the test suite, do cabal test or stack test.

In its standard mode, the test suite will run golden tests of the individual readers and writers. Reader tests can be found in test/reader/{mml,omml,tex}, and writer tests in test/writer/{eqn,mml,omml,tex}. Regression tests linked to specific issues are in test/regression.

Each test file consists of an input and an expected output. The input begins after a line <<< FORMAT and the output begins after a line >>> FORMAT.

If many tests fail as a result of changes, but the test failures are all because of improvements in the output, you can pass --accept to the test suite (e.g., with --test-arguments=--accept on stack test), and the existing golden files will be overwritten. If you do this, inspect the outputs very carefully to make sure they are correct.

If you pass the --roundtrip option into the test suite (e.g., using --test-arguments=--roundtrip with stack test), round-trip tests will be run instead. Many of these will fail. In these tests, the native inputs in test/roundtrip/*.native will be converted to (respectively) mml, omml, or tex, then converted back, and the result will be compared with the starting point. Although we don't guarantee that this kind of round-trip transformation will be the identity, looking at cases where it fails can be a guide to improvements.

Authors

John MacFarlane wrote the original TeX reader, MathML writer, Eq writer, and OMML writer. Matthew Pickering contributed the MathML reader, the TeX writer, and many of the auxiliary modules. Jesse Rosenthal contributed the OMML reader. Thanks also to John Lenz for many contributions.

More Repositories

1

pandoc

Universal markup converter
Haskell
32,409
star
2

gitit

A wiki using HAppS, pandoc, and git
Haskell
2,126
star
3

djot

A light markup language
HTML
1,557
star
4

peg-markdown

An implementation of markdown in C, using a PEG grammar
C
686
star
5

pandocfilters

A python module for writing pandoc filters, with a collection of examples
Python
494
star
6

pandoc-templates

Templates for pandoc, tagged to release
HTML
418
star
7

yst

create static websites from YAML data and string templates
Haskell
373
star
8

pandoc-citeproc

Library and executable for using citeproc with pandoc
Haskell
288
star
9

lunamark

Lua library for conversion between markup formats
C
186
star
10

skylighting

A Haskell syntax highlighting library with tokenizers derived from KDE syntax highlighting descriptions
Haskell
185
star
11

citeproc

CSL citation processing library in Haskell
Haskell
138
star
12

commonmark-hs

Pure Haskell commonmark parsing library, designed to be flexible and extensible
Haskell
130
star
13

djot.js

JavaScript implementation of djot
TypeScript
120
star
14

highlighting-kate

A syntax highlighting library in Haskell, based on Kate syntax definitions
HTML
109
star
15

cheapskate

Experimental markdown processor in Haskell
HTML
105
star
16

pandoc-types

types for representing structured documents
Haskell
105
star
17

gitit2

A reimplementation of gitit in Yesod
Haskell
94
star
18

lcmark

Flexible CommonMark converter
Lua
54
star
19

doctemplates

Pandoc-compatible templating system
Haskell
49
star
20

zip-archive

Native Haskell library for working with zip archives
Haskell
44
star
21

cmark-hs

Haskell bindings to libcmark commonmark parser
C
43
star
22

djot.lua

Lua parser for the djot light markup language
Lua
39
star
23

typst-hs

Haskell library for parsing and evaluating typst
Haskell
32
star
24

dotvim

My vim configuration
Vim Script
30
star
25

scripts

A collection of small scripts to do various things
Shell
28
star
26

filestore

A versioning file store backed by git, darcs, or mercurial
Haskell
28
star
27

pandoc-website

Source files for pandoc's website
Lua
28
star
28

illuminate

An efficient syntax highlighting library in Haskell, using alex-generated lexers
Haskell
26
star
29

emojis

Haskell library for emojis
Haskell
25
star
30

markdown-peg

A Haskell implementation of markdown using a PEG grammar
Haskell
24
star
31

pandoc-server

Simple server app for pandoc conversions.
Haskell
20
star
32

doclayout

A prettyprinting library designed for laying out plain text documents
Haskell
20
star
33

standalone-html

Incorporates external dependencies into HTML file using data: URI scheme
Haskell
19
star
34

pandoc-tex2svg

Pandoc filter to convert math to SVG using MathJax-node's tex2svg
HTML
19
star
35

cloudlib

tools for keeping a library of books and articles on Amazon's S3 and SimpleDB
Ruby
19
star
36

cmark-lua

Lua bindings to libcmark CommonMark parser
C
17
star
37

HeX

a flexible text macro system
Haskell
17
star
38

djoths

Haskell parser for the djot light markup language
Haskell
17
star
39

unicode-collation

Haskell implementation of the Unicode Collation Algorithm
Haskell
16
star
40

sep-offprint

Creates formatted "offprints" of Stanford Encyclopedia of Philosophy entries.
15
star
41

BayHac2014

Slides for my presentation on pandoc at BayHac2014
TeX
14
star
42

cmarkpdf

Steps towards a PDF renderer for cmark using libharu
C
14
star
43

lunamark-standalone

Standalone version of lunamark (compiled with no library dependencies)
C
12
star
44

commonmarker

Ruby wrapper for libcmark (CommonMark parser)
Ruby
12
star
45

hsb2hs

Preprocessor for inserting literals with binary blobs into Haskell programs.
Haskell
11
star
46

ipynb

Data structures and JSON serializer/deserializer for Jupyter notebooks (.ipynb) format.
Jupyter Notebook
10
star
47

gogar

Computer implementation of Robert Brandom's "game of giving and asking for reasons," from Making It Explicit, chapter 3.
Ruby
10
star
48

emacsd

emacs configuration
Emacs Lisp
9
star
49

hscommonmark

pure Haskell CommonMark parser
Haskell
9
star
50

recaptcha

Haskell library for using the reCAPTCHA service
Haskell
8
star
51

select-meta

Pandoc lua filter for constructing metadata from YAML data sources using queries
Lua
8
star
52

html2cmark

Lua library to convert HTML5 to commonmark
Lua
8
star
53

citeproc-hs-bin

Command-line interface to the citeproc-hs CSL citation processing library
Haskell
8
star
54

grammata

Well-typed system for generating documents in multiple formats
Haskell
7
star
55

ecstatic

Static website management using tenjin templates and YAML data files
Ruby
7
star
56

hw2gitit

Script to convert haskellwiki pages to a gitit wiki
Haskell
7
star
57

hsgit

A higher-level interface to libgit2 functions than hlibgit2
Haskell
6
star
58

pandoc-highlight

Filter and library for using pandoc with highlighting-kate
Haskell
6
star
59

trypandoc

Live demo of pandoc
JavaScript
6
star
60

commonmark-lua

Lua binding to libcmark commonmark parser
Lua
5
star
61

rfc5051

Haskell implementation of RFC5051, simple unicode collation.
Haskell
5
star
62

jgm.github.com

jgm's web pages on github
4
star
63

rocks

luarocks repository
4
star
64

GHCUnicodeAlt

Improved version of GHC.Unicode, with benchmarks
Haskell
3
star
65

cmark-fuzz-data

A minimal fuzz test suite for cmark created by american fuzzy lop and afl-cmin
3
star
66

luacmark

Lua binding to CommonMark
C
2
star
67

typst-symbols

Defines symbols and emoji used in typst
Haskell
2
star