  • Stars: 128
  • Rank 281,044 (Top 6 %)
  • Language: Ruby
  • License: BSD 3-Clause "New...
  • Created over 14 years ago
  • Updated 7 months ago

Repository Details

Ruby bindings to RE2, a "fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python".

re2

A Ruby binding to re2, an "efficient, principled regular expression library".

Current version: 1.6.0
Supported Ruby versions: 1.8.7, 1.9.3, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 3.0, 3.1, 3.2
Supported re2 versions: libre2.0 (< 2020-03-02), libre2.1 (2020-03-02), libre2.6 (2020-03-03), libre2.7 (2020-05-01), libre2.8 (2020-07-06), libre2.9 (2020-11-01), libre2.10 (2022-12-01)

Installation

You will need re2 installed as well as a C++ compiler such as gcc (on Debian and Ubuntu, this is provided by the build-essential package). If you are using Mac OS X, I recommend installing re2 with Homebrew by running the following:

$ brew install re2

If you are using Debian, you can install the libre2-dev package like so:

$ sudo apt-get install libre2-dev

Recent versions of re2 require a compiler with C++11 support such as clang 3.4 or gcc 4.8.

If you are using a packaged Ruby distribution, make sure you also have the Ruby header files installed such as those provided by the ruby-dev package on Debian and Ubuntu.

You can then install the library via RubyGems with gem install re2. If re2 is not installed in any of the following default locations, use gem install re2 -- --with-re2-dir=/path/to/re2/prefix instead:

  • /usr/local
  • /opt/homebrew
  • /usr

Documentation

Full documentation automatically generated from the latest version is available at http://mudge.name/re2/.

Note that re2's regular expression syntax differs from that of PCRE and Ruby's built-in Regexp library; see the official syntax page for more details.
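As an illustration of the kind of difference to expect: to my knowledge, backreferences and lookahead assertions are among the constructs RE2 does not support, whereas Ruby's built-in Regexp accepts both. The sketch below uses only the built-in Regexp (re2 is not required to run it) and the variable names are purely illustrative; check the syntax page before porting patterns like these.

```ruby
# Built-in Regexp only. These constructs are assumed to be rejected by RE2.
doubled_letter = /(\w)\1/      # backreference to group 1
foo_before_bar = /foo(?=bar)/  # positive lookahead

has_double = doubled_letter.match?("hello")  # "ll" repeats a letter
no_double = doubled_letter.match?("world")   # no letter is repeated
prefix = "foobar"[foo_before_bar]            # the lookahead is not part of the match

puts has_double  # true
puts no_double   # false
puts prefix      # foo
```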

Usage

While re2 uses the same naming scheme as Ruby's built-in regular expression library (with Regexp and MatchData), its API is slightly different:

$ irb -rubygems
> require 're2'
> r = RE2::Regexp.new('w(\d)(\d+)')
=> #<RE2::Regexp /w(\d)(\d+)/>
> m = r.match("w1234")
=> #<RE2::MatchData "w1234" 1:"1" 2:"234">
> m[1]
=> "1"
> m.string
=> "w1234"
> m.begin(1)
=> 1
> m.end(1)
=> 2
> r =~ "w1234"
=> true
> r !~ "bob"
=> true
> r.match("bob")
=> nil

As RE2::Regexp.new (or RE2::Regexp.compile) can be quite verbose, a helper method has been defined against Kernel so you can use a shorter version to create regular expressions:

> RE2('(\d+)')
=> #<RE2::Regexp /(\d+)/>

Note the use of single quotes: in a double-quoted string, Ruby interprets \d as d, as in the following example:

> RE2("(\d+)")
=> #<RE2::Regexp /(d+)/>
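The quoting behaviour comes from Ruby's string literals rather than from re2, so it can be checked without the library. A minimal sketch (the variable names are illustrative only):

```ruby
single = '(\d+)'    # single quotes keep the backslash: 5 characters
double = "(\d+)"    # "\d" is not a recognised escape, so the backslash is dropped
escaped = "(\\d+)"  # doubling the backslash preserves it in double quotes

puts single.length      # 5
puts double             # (d+)
puts single == escaped  # true
```

If a pattern has to be written in double quotes (for interpolation, say), doubling each backslash gives the same string as the single-quoted form.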

As of 0.3.0, you can use named groups:

> r = RE2::Regexp.new('(?P<name>\w+) (?P<age>\d+)')
=> #<RE2::Regexp /(?P<name>\w+) (?P<age>\d+)/>
> m = r.match("Bob 40")
=> #<RE2::MatchData "Bob 40" 1:"Bob" 2:"40">
> m[:name]
=> "Bob"
> m["age"]
=> "40"

As of 0.6.0, you can use RE2::Regexp#scan to incrementally scan text for matches (similar in purpose to Ruby's String#scan). Calling scan will return an RE2::Scanner, which is enumerable, meaning you can use each to iterate through the matches (and even use Enumerator::Lazy):

re = RE2('(\w+)')
scanner = re.scan("It is a truth universally acknowledged")
scanner.each do |match|
  puts match
end

scanner.rewind

enum = scanner.to_enum
enum.next #=> ["It"]
enum.next #=> ["is"]
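For comparison, here is roughly the same iteration written with Ruby's built-in String#scan, which needs no extra library. This is only a sketch of the analogous built-in behaviour, not of how RE2::Scanner is implemented:

```ruby
text = "It is a truth universally acknowledged"

# With a capture group, String#scan returns one array of captures per match.
all_matches = text.scan(/(\w+)/)

# enum_for wraps the block form of scan in an Enumerator, and
# lazy.first(2) takes only the first two matches from it.
first_two = text.enum_for(:scan, /(\w+)/).lazy.first(2)

p all_matches.length  # 6
p first_two           # [["It"], ["is"]]
```

As with the scanner example above, each match is an array of captures rather than a bare string.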

As of 1.5.0, you can use RE2::Set to match multiple patterns against a string. Calling RE2::Set#add with a pattern will return an integer index of the pattern. After all patterns have been added, the set can be compiled using RE2::Set#compile, and then RE2::Set#match will return an Array<Integer> containing the indices of all the patterns that matched.

set = RE2::Set.new
set.add("abc") #=> 0
set.add("def") #=> 1
set.add("ghi") #=> 2
set.compile #=> true
set.match("abcdefghi") #=> [0, 1, 2]
set.match("ghidefabc") #=> [2, 1, 0]
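To make the shape of the result concrete, the following pure-Ruby sketch computes "the indices of all the patterns that matched" with the built-in Regexp. It is an illustration of the return value only: matching_indices is a hypothetical helper that tests the patterns one at a time, which is not how RE2::Set works, and it always returns indices in ascending order, which need not be the order RE2::Set returns.

```ruby
patterns = [/abc/, /def/, /ghi/]

# Hypothetical helper: returns the index of every pattern that matches text.
def matching_indices(patterns, text)
  patterns.each_index.select { |i| patterns[i].match?(text) }
end

p matching_indices(patterns, "abcdefghi")  # [0, 1, 2]
p matching_indices(patterns, "xxdefxx")    # [1]
p matching_indices(patterns, "none")       # []
```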

As of 1.6.0, you can use Ruby's pattern matching against RE2::MatchData with both array patterns and hash patterns:

case RE2('(\w+) (\d+)').match("Alice 42")
in [name, age]
  puts "My name is #{name} and I am #{age} years old"
else
  puts "No match!"
end
# My name is Alice and I am 42 years old


case RE2('(?P<name>\w+) (?P<age>\d+)').match("Alice 42")
in {name:, age:}
  puts "My name is #{name} and I am #{age} years old"
else
  puts "No match!"
end
# My name is Alice and I am 42 years old
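Ruby's pattern matching (available from Ruby 2.7) works on any object that responds to deconstruct (for array patterns) and deconstruct_keys (for hash patterns). The sketch below shows that protocol with a hypothetical FakeMatch class; it is a stand-in for illustration and makes no claim about how RE2::MatchData implements these methods internally.

```ruby
# Hypothetical stand-in for a match result. This is not RE2::MatchData.
class FakeMatch
  def initialize(captures, names)
    @captures = captures
    @names = names
  end

  # Called for array patterns such as `in [name, age]`.
  def deconstruct
    @captures
  end

  # Called for hash patterns such as `in {name:, age:}`.
  def deconstruct_keys(_keys)
    @names.zip(@captures).to_h
  end
end

match = FakeMatch.new(["Alice", "42"], [:name, :age])

by_position = case match
              in [name, age] then "#{name} is #{age}"
              end

by_name = case match
          in {name:, age:} then "#{name} is #{age}"
          end

puts by_position  # Alice is 42
puts by_name      # Alice is 42
```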

Features

  • Pre-compiling regular expressions with RE2::Regexp.new(re), RE2::Regexp.compile(re) or RE2(re) (including specifying options, e.g. RE2::Regexp.new("pattern", :case_sensitive => false))

  • Extracting matches with re2.match(text) (and an exact number of matches with re2.match(text, number_of_matches) such as re2.match("123-234", 2))

  • Extracting matches by name (both with strings and symbols)

  • Checking for matches with re2 =~ text, re2 === text (for use in case statements) and re2 !~ text

  • Incrementally scanning text with re2.scan(text)

  • Searching a collection of patterns simultaneously with RE2::Set

  • Checking regular expression compilation with re2.ok?, re2.error and re2.error_arg

  • Checking regular expression "cost" with re2.program_size

  • Checking the options for an expression with re2.options or individually with re2.case_sensitive?

  • Performing a single string replacement with pattern.replace(replacement, original)

  • Performing a global string replacement with pattern.replace_all(replacement, original)

  • Escaping regular expressions with RE2.escape(unquoted) and RE2.quote(unquoted)

  • Pattern matching with RE2::MatchData

Contributions

  • Thanks to Jason Woods who contributed the original implementations of RE2::MatchData#begin and RE2::MatchData#end;
  • Thanks to Stefano Rivera who first contributed C++11 support;
  • Thanks to Stan Hu for reporting a bug with empty patterns and RE2::Regexp#scan;
  • Thanks to Sebastian Reitenbach for reporting the deprecation and removal of the utf8 encoding option in re2;
  • Thanks to Sergio Medina for reporting a bug when using RE2::Scanner#scan with an invalid regular expression;
  • Thanks to Pritam Baral for contributing the initial support for RE2::Set.

Contact

All issues and suggestions should go to GitHub Issues.
