Regular expressions for objects

REfO

Lacking a proper name, REfO stands for "Regular Expressions for Objects".

It is a Python library that provides functionality very similar to Python's re module (regular expressions), but for arbitrary sequences of objects instead of strings (sequences of characters).

In addition, each object in a sequence can be matched not only by equality but by an arbitrary Python function. For example, given a sequence of integers, you can write a regular expression that asks for an even number, followed by a prime number, followed by a number divisible by 3.
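To make the idea of matching by function concrete, here is a minimal sketch in plain Python. It does not use REfO at all: `find_run` and the three predicate functions are hypothetical helpers written for this illustration, and the sketch only handles a fixed-length concatenation of predicates, not stars, alternations or groups.

```python
from typing import Callable, Sequence


def is_even(n: int) -> bool:
    return n % 2 == 0


def is_prime(n: int) -> bool:
    if n < 2:
        return False
    # Trial division up to the integer square root.
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def divisible_by_3(n: int) -> bool:
    return n % 3 == 0


def find_run(seq: Sequence[int], predicates: Sequence[Callable[[int], bool]]) -> int:
    """Return the start index of the first window in which every item
    satisfies the predicate at the same position, or -1 if there is none."""
    k = len(predicates)
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        if all(p(x) for p, x in zip(predicates, window)):
            return start
    return -1


numbers = [7, 9, 4, 5, 12, 8]
pattern = [is_even, is_prime, divisible_by_3]
print(find_run(numbers, pattern))  # 4, 5, 12 starts at index 2
```

A regular expression over objects generalizes this: each position in the pattern is a test on one object, and the usual operators (concatenation, alternation, repetition) combine those tests.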

This software was written by Rafael Carrascosa while working at Machinalis in the first months of 2012.

Contact: rafacarrascosa xyz gmail.com (replace " xyz " with "@")

How to use it

The syntax is a little different from Python's re and similar to that of pyparsing: you have to build the syntax tree of your regular expression more or less explicitly. For instance:

"ab" is Literal("a") + Literal("b")

"a*" is Star(Literal("a"))

"(ab)+|(bb)*?" is:

from refo import Literal, Plus, Star

a = Literal("a")
b = Literal("b")
regex = Plus(a + b) | Star(b + b, greedy=False)
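For readers wondering how `+` and `|` can build a syntax tree, the following is a rough, self-contained sketch of the general technique (operator overloading on node classes). The class names `Node`, `Lit`, `Concat`, `Alt` and `Repeat` are hypothetical and chosen to differ from REfO's own; this is not REfO's implementation.

```python
from dataclasses import dataclass


class Node:
    """Base class: `+` concatenates and `|` alternates, returning new nodes."""

    def __add__(self, other):
        return Concat(self, other)

    def __or__(self, other):
        return Alt(self, other)


@dataclass(frozen=True)
class Lit(Node):
    value: object


@dataclass(frozen=True)
class Concat(Node):
    left: Node
    right: Node


@dataclass(frozen=True)
class Alt(Node):
    left: Node
    right: Node


@dataclass(frozen=True)
class Repeat(Node):
    child: Node
    minimum: int = 0      # 0 behaves like "*", 1 behaves like "+"
    greedy: bool = True


a, b = Lit("a"), Lit("b")
# Same shape as "(ab)+|(bb)*?": the tree is built as the expression is evaluated.
tree = Repeat(a + b, minimum=1) | Repeat(b + b, greedy=False)
print(tree)
```

Because the tree is an ordinary Python object, the leaves can hold any object or test rather than only characters.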

You can also assign a group to any sub-match and later on retrieve the matched content, for instance:

from refo import Group, match

regex = Group(Plus(a + b), "foobar") | (b + b)
m = match(regex, "abab")
print(m.span("foobar"))  # prints (0, 4)

For more, check out the examples in the examples folder.

How we use it

At Machinalis we use REfO for applications similar to the one in examples/words.py. Check it out!

About the implementation

I use a Thompson-like virtual machine approach, which ensures polynomial worst-case time complexity. See examples/poly_time.py for an example of this.

The implementation is heavily based on Russ Cox's notes; see http://swtch.com/~rsc/regexp/regexp2.html for the source.

If you read the code, this short glossary may help:

  • RE -- regular expression
  • VM -- virtual machine
  • Epsilon transitions -- All VM instructions that do not consume a symbol or stop the thread (for example an Accept).
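As a rough illustration of these terms, here is a small lockstep virtual machine in the style described in Russ Cox's notes. It is a sketch under assumptions, not REfO's code: the instruction names (`match`, `split`, `jmp`, `accept`), the tuple encoding and the functions `add_thread` and `vm_match` are all made up for this example. It reads the glossary as saying that `jmp` and `split` are epsilon transitions, `match` consumes a symbol, and `accept` stops the thread.

```python
def add_thread(program, pc, threads, seen):
    """Follow epsilon transitions (jmp/split) from pc and collect the
    consuming or accepting instructions they lead to."""
    if pc in seen:
        return  # each instruction is added at most once per step
    seen.add(pc)
    op = program[pc]
    if op[0] == "jmp":
        add_thread(program, op[1], threads, seen)
    elif op[0] == "split":
        add_thread(program, op[1], threads, seen)
        add_thread(program, op[2], threads, seen)
    else:
        threads.append(pc)


def vm_match(program, sequence):
    """Return True if the program accepts the whole sequence."""
    threads = []
    add_thread(program, 0, threads, set())
    for item in sequence:
        next_threads, seen = [], set()
        for pc in threads:
            op = program[pc]
            # Only "match" consumes a symbol; its test is an arbitrary function.
            if op[0] == "match" and op[1](item):
                add_thread(program, pc + 1, next_threads, seen)
        threads = next_threads
    return any(program[pc][0] == "accept" for pc in threads)


def is_even(n):
    return n % 2 == 0


def is_odd(n):
    return n % 2 == 1


# Program for "one or more even numbers, then one odd number".
program = [
    ("match", is_even),  # 0
    ("split", 0, 2),     # 1: loop back for another even number, or go on
    ("match", is_odd),   # 2
    ("accept",),         # 3
]

print(vm_match(program, [2, 4, 7]))  # True
print(vm_match(program, [2, 4]))     # False
```

Since the `seen` set caps the thread list at one entry per instruction, each input item costs at most time proportional to the program length, which is where the polynomial worst-case bound comes from.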

Acknowledgements

Thanks Russ Cox for sharing the awesome info and insights on your web site.

Thanks Javier Mansilla for reviewing the code and being enthusiastic about it.

Thanks Machinalis for everything :)
