  • License
    MIT License

Repository Details

Markdown-like DSL for defining grammatical syntax for programming languages.

grammarkdown

Summary

grammarkdown is a markdown-style parser for syntactic grammars, designed to make it easy to rapidly prototype a grammar and statically verify its consistency. The grammar supported by grammarkdown is based on the parametric grammar used by ECMA-262 (the JavaScript language standard).

Usage

Syntax:                   grammarkdown [options] [...files]

Examples:                 grammarkdown es6.grammar
                          grammarkdown --out es6.md --format markdown es6.grammar

Options:
 -f, --format FORMAT      The output format.
 -h, --help               Prints this message.
     --noChecks           Does not perform static checking of the grammar.
     --noEmit             Does not emit output.
     --noEmitOnError      Does not emit output if there are errors.
 -o, --out FILE           Specify the output file.
 -v, --version            Prints the version.

Syntax

A grammarkdown grammar file uses significant whitespace in the form of line terminators and indentation. Tab (ASCII 0x9) characters are preferred; however, using multiple spaces for indentation is supported as long as all nested elements have the same amount of leading whitespace.

Productions

A Production consists of a left-hand-side Nonterminal followed by a colon (:) separator and one or more right-hand-side sentences consisting of various forms of terminal and nonterminal symbols. For example:

NameSpaceImport : `*` `as` ImportedBinding

It is recommended that Production names follow pascal-case naming conventions, to avoid collisions with reserved keywords.

You may specify multiple productions for a Nonterminal on multiple lines, as follows:

NamedImports : `{` `}`
NamedImports : `{` ImportList `}`
NamedImports : `{` ImportList `,` `}`

You may also specify multiple right-hand-side sentences for a single production by indenting them:

NamedImports :
    `{` `}`
    `{` ImportList `}`
    `{` ImportList `,` `}`

A Production may specify one or more parameters that can be used to reuse a Nonterminal in various circumstances:

IdentifierReference[Yield] :
    Identifier
    [~Yield] `yield`
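
As a rough illustration of how a parameter guard such as [~Yield] selects alternatives, the sketch below models a production as plain data and filters its right-hand sides by the set of parameters that are currently set. The object shape and the `alternativesFor` helper are hypothetical and are not part of grammarkdown's API.

```javascript
// Hypothetical data model (not grammarkdown's own AST): each alternative has
// a list of symbols and an optional guard such as "~Yield" or "+Yield".
const identifierReference = {
  name: "IdentifierReference",
  parameters: ["Yield"],
  alternatives: [
    { symbols: ["Identifier"] },
    { guard: "~Yield", symbols: ["`yield`"] },
  ],
};

// Returns the right-hand sides that apply when the parameters named in
// `setParams` are set and every other parameter is cleared.
function alternativesFor(production, setParams) {
  return production.alternatives
    .filter(({ guard }) => {
      if (!guard) return true;                 // unguarded: always applies
      const isSet = setParams.has(guard.slice(1));
      return guard[0] === "+" ? isSet : !isSet; // "+": must be set, "~": must be cleared
    })
    .map(({ symbols }) => symbols);
}

console.log(alternativesFor(identifierReference, new Set()));
// [ [ 'Identifier' ], [ '`yield`' ] ]
console.log(alternativesFor(identifierReference, new Set(["Yield"])));
// [ [ 'Identifier' ] ]
```

With Yield cleared, `yield` is accepted as an IdentifierReference; with Yield set, only Identifier remains.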

A Production may also specify a limited set of terminals, by using the one of keyphrase:

Keyword :: one of
	`break`		`do`		`in`			`typeof`
	`case`		`else`		`instanceof`	`var`
	`catch`		`export`	`new`			`void`
	`class`		`extends`	`return`		`while`
	`const`		`finally`	`super`			`with`
	`continue`	`for`		`switch`		`yield`
	`debugger`	`function`	`this`
	`default`	`if`		`throw`
	`delete`	`import`	`try`

Parameters

If a Nonterminal on the right-hand-side of a production needs to set a parameter, it supplies the parameter in an argument list. Supplying the name of the argument sets the parameter, prefixing the name with a question mark ('?') passes the current value of the parameter, and eliding the argument clears the parameter:

Declaration[Yield] :
	HoistableDeclaration[?Yield]
	ClassDeclaration[?Yield]
	LexicalDeclaration[In, ?Yield]
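
The three argument forms (set, pass through, clear) can be sketched as a small resolution function. This is only an illustration of the rule described above; `resolveArguments` is a hypothetical helper, and the parameter list given for LexicalDeclaration is an assumption made for the example.

```javascript
// Hypothetical helper, not part of the grammarkdown API.
// calleeParams: parameter names declared by the referenced Nonterminal.
// args: arguments as written in the grammar, e.g. ["In", "?Yield"].
// current: Set of parameter names that are set in the enclosing production.
function resolveArguments(calleeParams, args, current) {
  const resolved = {};
  for (const param of calleeParams) {
    if (args.includes(param)) {
      resolved[param] = true;               // named argument: parameter is set
    } else if (args.includes("?" + param)) {
      resolved[param] = current.has(param); // "?" prefix: pass the current value
    } else {
      resolved[param] = false;              // elided argument: parameter is cleared
    }
  }
  return resolved;
}

// LexicalDeclaration[In, ?Yield] referenced from Declaration[Yield],
// assuming LexicalDeclaration declares the parameters In and Yield.
console.log(resolveArguments(["In", "Yield"], ["In", "?Yield"], new Set(["Yield"])));
// { In: true, Yield: true }
console.log(resolveArguments(["In", "Yield"], ["In", "?Yield"], new Set()));
// { In: true, Yield: false }
console.log(resolveArguments(["In", "Yield"], ["?Yield"], new Set(["Yield"])));
// { In: false, Yield: true }
```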

The right-hand-side of a Production consists of one or more Terminal or Nonterminal symbols, a sentence of Prose, or an Assertion.

Terminals

A Terminal symbol can be one of the following:

  • A literal string of one or more characters enclosed in backticks ('`'). For example: `function`
  • A sequence of three backtick characters, which denotes a backtick token. For example: ```
  • A Unicode character literal enclosed in a leading less-than ('<') character and a trailing greater-than ('>') character. For example: <TAB>

Nonterminals

A Nonterminal symbol is an identifier, followed by an optional argument list, and an optional question mark ('?'). The question mark changes the cardinality of the Nonterminal from "exactly one" to "zero or one". The identifier may optionally be enclosed in | characters, if it happens to collide with a keyword.
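
In ECMA-262's notation, an optional symbol is shorthand for two right-hand sides, one with the symbol and one without. The sketch below expands a right-hand side on that basis; it treats symbols as plain strings and marks optional ones with a trailing `?`. `expandOptionals` is a hypothetical helper for illustration, not something grammarkdown exposes.

```javascript
// Hypothetical helper: expands "zero or one" symbols (trailing "?") into
// every combination of present and absent.
function expandOptionals(symbols) {
  let results = [[]];
  for (const symbol of symbols) {
    if (symbol.endsWith("?")) {
      const name = symbol.slice(0, -1);
      // Each partial result forks: one copy omits the symbol, one includes it.
      results = results.flatMap((partial) => [partial, [...partial, name]]);
    } else {
      results = results.map((partial) => [...partial, symbol]);
    }
  }
  return results;
}

console.log(expandOptionals(["BindingIdentifier", "Initializer?"]));
// [ [ 'BindingIdentifier' ], [ 'BindingIdentifier', 'Initializer' ] ]
```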

Character Literals and Ranges

Character literals may be specified using one of the following forms:

  • A Unicode code point, of the form U+ followed by four to six non-lowercase hexadecimal digits with no leading zeros other than those necessary for padding to a minimum of four digits, in accordance with The Unicode Standard, Version 15.0.0, Appendix A, Notational Conventions (i.e., matching Unicode extended BNF pattern "U+" ( [1-9 A-F] | "10" )? H H H H or regular expression pattern ^U[+]([1-9A-F]|10)?[0-9A-F]{4}$ as in U+00A0 or U+1D306).
  • The preceding representation followed by a space and a printable ASCII prose explanation (such as a character name) free of < and > and line terminators, all wrapped in < and > (i.e., matching Unicode extended BNF pattern "<" "U+" ( [1-9 A-F] | "10" )? H H H H " " [\u0020-\u007E -- [<>]]+ ">" or regular expression pattern ^<U[+]([1-9A-F]|10)?[0-9A-F]{4} [\x20-\x3b\x3d\x3f-\x7e]+>$ as in <U+2212 MINUS SIGN>).
  • An abbreviation defined somewhere outside the grammar as an ASCII identifier name, wrapped in < and > (i.e., matching Unicode extended BNF pattern "<" [A-Z a-z _] [A-Z a-z _ 0-9]* ">" or regular expression pattern ^<[A-Za-z_][A-Za-z_0-9]*>$ as in <NBSP>).
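
The three forms can be told apart with the regular expressions quoted in the list above. The sketch below is a minimal classifier under that assumption; `classifyCharacterLiteral` is a hypothetical name, and the abbreviation pattern is written with the closing `>` before the `$` anchor so that it can match.

```javascript
// Patterns taken from the list above.
const codePointPattern = /^U[+]([1-9A-F]|10)?[0-9A-F]{4}$/;
const describedCodePointPattern = /^<U[+]([1-9A-F]|10)?[0-9A-F]{4} [\x20-\x3b\x3d\x3f-\x7e]+>$/;
const abbreviationPattern = /^<[A-Za-z_][A-Za-z_0-9]*>$/;

// Hypothetical helper: names the form a character literal uses, or "invalid".
function classifyCharacterLiteral(text) {
  if (codePointPattern.test(text)) return "code point";
  if (describedCodePointPattern.test(text)) return "described code point";
  if (abbreviationPattern.test(text)) return "abbreviation";
  return "invalid";
}

console.log(classifyCharacterLiteral("U+00A0"));              // code point
console.log(classifyCharacterLiteral("U+1D306"));             // code point
console.log(classifyCharacterLiteral("<U+2212 MINUS SIGN>")); // described code point
console.log(classifyCharacterLiteral("<NBSP>"));              // abbreviation
console.log(classifyCharacterLiteral("U+00a0"));              // invalid (lowercase hex digit)
console.log(classifyCharacterLiteral("U+01D306"));            // invalid (unnecessary leading zero)
```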

Character ranges may be specified using the through keyword:

    SourceCharacter but not one of `"` or `\` or U+0000 through U+001F
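
One way to read the range example above is as a predicate over single characters. The sketch below is illustrative only: `isAllowedCharacter` is a hypothetical name, SourceCharacter is assumed to mean any Unicode code point, and the range is assumed to include both endpoints.

```javascript
// Hypothetical predicate mirroring:
//   SourceCharacter but not one of `"` or `\` or U+0000 through U+001F
function isAllowedCharacter(ch) {
  const codePoint = ch.codePointAt(0);
  const excludedTerminals = ['"', "\\"];
  if (excludedTerminals.includes(ch)) return false;
  // "through" is read here as an inclusive range of code points (assumption).
  if (codePoint >= 0x0000 && codePoint <= 0x001f) return false;
  return true;
}

console.log(isAllowedCharacter("a"));  // true
console.log(isAllowedCharacter('"'));  // false
console.log(isAllowedCharacter("\n")); // false (U+000A falls inside the range)
console.log(isAllowedCharacter(" "));  // true  (U+0020 is just past the range)
```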

Prose

A sentence of Prose is a single line with a leading greater-than ('>') character. For example: > any Unicode code point

The but not Condition

The but not condition allows you to reference a production, excluding some part of that production. For example:

MultiLineNotAsteriskChar ::
	SourceCharacter but not `*`

Here, MultiLineNotAsteriskChar may contain any alternative from SourceCharacter, except the terminal `*`.

The one of Condition

You can exclude multiple alternatives by including a list of symbols to exclude through the use of the one of keyphrase. Each entry in the list is separated by or:

MultiLineNotForwardSlashOrAsteriskChar ::
	SourceCharacter but not one of `/` or `*`

Assertions

An Assertion is a zero-width test that must evaluate successfully for the Production to be considered. Assertions are enclosed in a leading open-bracket ('[') character and a trailing close-bracket (']') character.

The possible assertions include:

  • The empty assertion, which matches exactly zero tokens: [empty]
  • The lookahead assertion, which verifies the next tokens in the stream: [lookahead != `function`]
  • The no-symbol-here assertion, which verifies the next token is not the provided symbol: [no LineTerminator here]
  • The lexical-goal assertion, which states that the current lexical goal is the supplied Nonterminal: [lexical goal InputElementRegExp]
  • The parameter assertion, which states the supplied parameter to the current production is either set (using the plus ('+') character), or cleared (using the tilde ('~') character): [~Yield] `yield`
  • The prose assertion, which allows for arbitrary prose, mixed with terminals and nonterminals: [> prose text `terminal` prose text |NonTerminal| prose text]

A lookahead assertion has the following operators:

  • The == operator states the lookahead phrase is matched: [lookahead == `class`]
  • The != operator states the lookahead phrase is not matched: [lookahead != `function`]
  • The <- operator states that any phrase in the provided set is matched: [lookahead <- { `public`, `private` }]
  • The <! operator states that no phrase in the provided set is matched: [lookahead <! { `{`, `function` }]
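
The four operators can be summarized as a small evaluation function. This sketch simplifies each lookahead phrase to a single token string, and `lookaheadHolds` is a hypothetical helper rather than part of grammarkdown, which only describes and checks grammars.

```javascript
// Hypothetical helper: decides whether a lookahead assertion holds for the
// next token. `operand` is a string for == and !=, and an array for <- and <!.
function lookaheadHolds(operator, operand, nextToken) {
  switch (operator) {
    case "==": return nextToken === operand;        // phrase is matched
    case "!=": return nextToken !== operand;        // phrase is not matched
    case "<-": return operand.includes(nextToken);  // some phrase in the set is matched
    case "<!": return !operand.includes(nextToken); // no phrase in the set is matched
    default: throw new Error("Unknown lookahead operator: " + operator);
  }
}

console.log(lookaheadHolds("==", "class", "class"));               // true
console.log(lookaheadHolds("!=", "function", "function"));         // false
console.log(lookaheadHolds("<-", ["public", "private"], "private")); // true
console.log(lookaheadHolds("<!", ["{", "function"], "let"));       // true
```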

Linking

During emit, grammarkdown implicitly adds a generated name for each Production and Right-hand side that can be used to link directly to the production using a URI fragment. You can explicitly set the name for a production by tagging it with a custom link name:

Declaration[Yield] :
	HoistableDeclaration[?Yield]       #declaration-hoistable
	ClassDeclaration[?Yield]           #declaration-class
	LexicalDeclaration[In, ?Yield]     #declaration-lexical

Comments

You can also annotate your grammar with C-style single-line and multi-line comments.

Examples

For comprehensive examples of grammarkdown syntax and output, you can review the following samples:

API

grammarkdown has an API that can be consumed:

var grammarkdown = require("grammarkdown")
  , Grammar = grammarkdown.Grammar
  , EmitFormat = grammarkdown.EmitFormat;

var filename = "...";
var source = "...";
var output;

// parse
var grammar = new Grammar(
  [filename],
  { format: EmitFormat.markdown },
  function () { return source; });

// bind (optional, bind happens automatically during check)
grammar.bind();

// check (optional, check happens automatically during emit)
grammar.check();

// emit
grammar.emit(undefined, function (file, text) { output = text; });

console.log(output);
