  • Language
    Ruby
  • License
    BSD 3-Clause "New...
  • Created about 13 years ago
  • Updated 4 months ago

A simple PEG library for ruby

kpeg

home

github.com/evanphx/kpeg

bugs

github.com/evanphx/kpeg/issues

Description

KPeg is a simple PEG library for Ruby. It provides an API as well as a native grammar format for building grammars.

KPeg strives to provide a simple, powerful API without being too exotic.

KPeg supports direct left recursion of rules via the OMeta memoization trick.
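The memoization trick can be illustrated with a small, self-contained Ruby sketch. This is not KPeg's implementation; it is a hand-written parser for a hypothetical grammar `expr = expr "-" num | num`, and the class and method names are made up for illustration. The idea is to plant a failing memo entry for the rule at its start position, so that the left-recursive call fails the first time, and then to re-run the rule body while each pass consumes more input than the last.

```ruby
# Sketch of seed-growing memoization for direct left recursion.
# Hypothetical grammar:  expr = expr "-" num | num
class LeftRecursiveSketch
  Memo = Struct.new(:value, :pos) # value is nil while the rule has failed

  def initialize(input)
    @input = input
    @pos = 0
    @memo = {} # start position => Memo for the expr rule
  end

  # Returns the computed value, or nil if the whole input did not match.
  def parse
    @pos = 0
    value = expr
    value if @pos == @input.size
  end

  private

  def expr
    start = @pos
    if (m = @memo[start])
      @pos = m.pos
      return m.value
    end

    # Plant a failing seed so the recursive call inside expr_body fails once.
    memo = @memo[start] = Memo.new(nil, start)

    loop do
      @pos = start
      value = expr_body
      # Stop growing as soon as a pass fails or consumes no more than before.
      break if value.nil? || @pos <= memo.pos
      memo.value = value
      memo.pos = @pos
    end

    @pos = memo.pos
    memo.value
  end

  # expr "-" num | num
  def expr_body
    start = @pos
    left = expr
    if left && literal("-") && (right = num)
      return left - right
    end
    @pos = start
    num
  end

  def num
    digits = @input[@pos..-1][/\A[0-9]+/]
    return nil unless digits
    @pos += digits.size
    digits.to_i
  end

  def literal(str)
    return false unless @input[@pos, str.size] == str
    @pos += str.size
    true
  end
end

puts LeftRecursiveSketch.new("1-2-3").parse
```

Because the seed grows from the left, subtraction comes out left-associative: the input above is evaluated as (1-2)-3.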

Writing your first grammar

Setting up your grammar

All grammars start with the class/module name that will be your parser

%% name = Example::Parser

After that a block of ruby code can be defined that will be added into the class body of your parser. Attributes that are defined in this block can be accessed within your parser as instance variables. Methods can also be defined in this block and used in action blocks as well.

%% {
  attr_accessor :something_cool

  def something_awesome
    # do something awesome
  end
}

Defining literals

Literals are static declarations of characters or regular expressions designed for reuse in the grammar. These can be constants or variables. Literals can take strings, regular expressions or character ranges

ALPHA = /[A-Za-z]/
DIGIT = /[0-9]/
period = "."
string = "a string"
regex = /(regexs?)+/
char_range = [b-t]

Literals can also accept multiple definitions

vowel = "a" | "e" | "i" | "o" | "u"
alpha = /[A-Z]/ | /[a-z]/

Defining Rules for Values

Before you can start parsing a string you will need to define rules that you will use to accept or reject that string. There are many different types of rules available in kpeg

The most basic of these rules is a string capture

alpha = < /[A-Za-z]/ > { text }

While this looks very much like the ALPHA literal defined above, it differs in one important way: the text captured between the < and > symbols is made available as the text variable in the block that follows. You can also explicitly name the variable you would like to use, but only with existing rules or literals.

letter = alpha:a { a }

Additionally blocks can return true or false values based upon an expression within the block. To return true if a test passes do the following:

match_greater_than_10 = < num:n > &{ n > 10 }

To test and return a false value if the test passes do the following:

do_not_match_greater_than_10 = < num:n > !{ n > 10 }
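To make the difference between the two predicate forms concrete, here is a plain Ruby sketch that does not use KPeg at all. The helper name `num_with_predicate` is hypothetical, `num` is assumed to match `/[1-9][0-9]*/`, and the captured text is assumed to be converted to an integer before the comparison.

```ruby
require "strscan"

# Hypothetical stand-in for `< num:n > &{ ... }` and `< num:n > !{ ... }`.
# Returns true when the rule as a whole would match, false otherwise.
def num_with_predicate(input, negate: false)
  scanner = StringScanner.new(input)
  text = scanner.scan(/[1-9][0-9]*/) # the `num` part has to match first
  return false unless text

  passed = yield(text.to_i) ? true : false
  negate ? !passed : passed
end

greater_than_10 = ->(n) { n > 10 }

# &{ n > 10 }: the rule matches only when the block is truthy.
puts num_with_predicate("42", &greater_than_10)
puts num_with_predicate("7", &greater_than_10)

# !{ n > 10 }: the rule matches only when the block is falsy.
puts num_with_predicate("7", negate: true, &greater_than_10)
```

In both forms the predicate only decides whether the rule succeeds; it is evaluated after `num` has matched.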

Rules can also act like functions and take parameters. An example of this is lifted from the Email List Validator, where an ascii value is passed in and the character is evaluated against it returning a true if it matches

d(num) = <.> &{ text[0] == num }

Rules support some regular expression syntax for matching

  • maybe ?

  • many +

  • kleene *

  • groupings ()

Examples:

letters = alpha+
words = alpha+ space* period?
sentence = (letters+ | space+)+

Kpeg also allows a rule to define the acceptable number of matches in the form of a range. In regular expressions this is often denoted with syntax like {0,3}. Kpeg writes match ranges with square brackets instead: [min, max], where max can be * for no upper bound.

matches_3_to_5_times = letter[3,5]
matches_3_to_any_times = letter[3,*]
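The behaviour of a match range can be sketched in plain Ruby. The helper `match_range` below is hypothetical and is not part of KPeg; it assumes the repetition is greedy, that `nil` plays the role of `*`, and that the pattern never matches the empty string.

```ruby
require "strscan"

# Hypothetical helper mirroring rule[min, max]: apply `pattern` greedily, up to
# `max` times (nil means no upper bound). Returns the number of repetitions, or
# nil after rewinding the scanner when fewer than `min` were found.
def match_range(scanner, pattern, min, max = nil)
  start = scanner.pos
  count = 0
  count += 1 while (max.nil? || count < max) && scanner.scan(pattern)
  return count if count >= min

  scanner.pos = start
  nil
end

letter = /[A-Za-z]/

# letter[3,5] stops after five letters even though more are available.
p match_range(StringScanner.new("abcdefg"), letter, 3, 5)

# letter[3,*] takes every letter it can find.
p match_range(StringScanner.new("abcdefg"), letter, 3)

# Only two letters are available, so letter[3,5] fails.
p match_range(StringScanner.new("ab1"), letter, 3, 5)
```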

Defining Actions

As some of the examples above illustrate, kpeg lets you perform an action when a rule matches. The action is written as a block in the rule definition itself.

num = /[1-9][0-9]*/
sum = < num:n1 "+" num:n2 > { n1 + n2 }

As of version 0.8 an alternate syntax has been added for calling defined methods as actions.

%% {
  def add(n1, n2)
    n1 + n2
  end
}
num = /[1-9][0-9]*/
sum = < num:n1 "+" num:n2 > ~add(n1, n2)

Referencing an external grammar

Kpeg allows you to run a rule that is defined in an external grammar. This is useful if there is a defined set of rules that you would like to reuse in another parser. To do this, create your grammar and generate a parser using the kpeg command line tool.

kpeg literals.kpeg

Once you have the generated parser, include that file into your new grammar

%{
  require "literals.kpeg.rb"
}

Then create a variable to hold the foreign interface and pass it the class name of your parser. In this case my parser class name is Literal

%foreign_grammar = Literal

You can then use rules defined in the foreign grammar in the local grammar file like so

sentence = (%foreign_grammar.alpha %foreign_grammar.space*)+
           %foreign_grammar.period

Comments

Kpeg allows comments to be added to the grammar file by using the # symbol

# This is a comment in my grammar

Variables

A variable looks like this:

%% name = value

Kpeg allows the following variables that control the output parser:

name

The class name of the generated parser.

custom_initialize

When built as a standalone parser a default initialize method will not be included.

Directives

A directive looks like this:

%% header {
  ...
}

Kpeg allows the following directives:

header

Placed before any generated code

pre-class

Placed before the class definition to provide a class comment

footer

Placed after the end of the class (for requiring files dependent upon the parser's namespace)

Generating and running your parser

Before you can generate your parser you will need to define a root rule. This will be the first rule run against the string provided to the parser

root = sentence

To generate the parser run the kpeg command with the kpeg file(s) as an argument. This will generate a ruby file with the same name as your grammar file.

kpeg example.kpeg

Require your generated parser file in the application that will use it. Create a new instance of the parser and pass in the string you want to evaluate. When parse is called on the parser instance it returns true if the string is matched, or false if it isn't.

require "example.kpeg.rb"

parser = Example::Parser.new(string_to_evaluate)
parser.parse

Shortcuts and other techniques

Per vito, you can get the current line or current column in the following way

line = { current_line }
column = { current_column }
foo = line:line ... { # use line here }

AST Generation

As of Kpeg 0.8 a parser can now generate an AST. To define an AST node use the following syntax

%% assign = ast Assignment(name, value)

Once you have a defined AST node, it can be used in your grammar like so

assignment = identifier:i space* "=" space* value:v ~assign(i,v)

This will create a new Assignment node that you can add into your AST.
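As a rough mental model, an `ast` declaration stands for a small node class plus a helper method that builds it. The plain Ruby sketch below is an assumption about that shape, not the code Kpeg generates: the module name `ExampleAST` is made up, and the real generated class layout may differ.

```ruby
# Hypothetical sketch of what `%% assign = ast Assignment(name, value)` stands for.
module ExampleAST
  class Node; end

  # One reader per field named in the declaration.
  class Assignment < Node
    attr_reader :name, :value

    def initialize(name, value)
      @name = name
      @value = value
    end
  end
end

# The helper that an action such as ~assign(i, v) would call.
def assign(name, value)
  ExampleAST::Assignment.new(name, value)
end

node = assign("x", "42")
puts "#{node.name} = #{node.value}"
```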

For a good example of usage check out Talon

Examples

There are several examples available in the /examples directory. The upper parser has a readme with a step by step description of the grammar.

Projects

Dang

Email Address Validator

Callisto

Doodle

Kanbanpad (uses kpeg for parsing of the 'enter something' bar)
