  • Stars: 126
  • Rank: 284,543 (Top 6 %)
  • Language: Haskell
  • License: BSD 3-Clause "New...
  • Created over 5 years ago
  • Updated about 4 years ago


Repository Details

Text lenses using PCRE regexes

lens-regex-pcre

Hackage and Docs

Based on pcre-heavy, so it should support any regexes or options that pcre-heavy supports.

Performance is equal to, and sometimes better than, that of pcre-heavy alone.

Which module should you use?

If you need Unicode support, use Control.Lens.Regex.Text; if not, Control.Lens.Regex.ByteString is faster.

Working with regexes in Haskell kinda sucks: it's tough to figure out which libs to use, and even after you pick one it's tough to figure out how to use it. lens-regex-pcre hopes to replace most other solutions by being fast, easy to set up, and more adaptable, with a more consistent interface.

It helps that there are already HUNDREDS of combinators which interop with lenses 😄.

As it turns out, regexes are a very lens-like tool: traversals allow you to select and alter zero or more matches, and they can even carry indexes so you know which match or group you're working on.
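To make the "traversal with an index" idea concrete, here is a minimal sketch that uses only base. It is not the library's API: whitespace-delimited words stand in for regex matches, and the names splitRuns, wordsIndexed, listWords and upperThird are hypothetical, chosen for illustration.

```haskell
import Data.Char (isSpace, toUpper)
import Data.Function (on)
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))
import Data.List (groupBy)

-- | Split a string into alternating runs of whitespace and non-whitespace.
-- (Hypothetical stand-in for "find the matches".)
splitRuns :: String -> [String]
splitRuns = groupBy ((==) `on` isSpace)

-- | A hand-rolled indexed traversal over the words of a string.
-- The action receives the match number along with the word;
-- whitespace runs are passed through unchanged.
wordsIndexed :: Applicative f => (Int -> String -> f String) -> String -> f String
wordsIndexed f s = concat <$> traverse step (number 0 (splitRuns s))
  where
    step (Just i, w) = f i w
    step (Nothing, gap) = pure gap
    -- Tag each word with its position; gaps get no index.
    number _ [] = []
    number i (r : rs)
      | all isSpace r = (Nothing, r) : number i rs
      | otherwise = (Just i, r) : number (i + 1) rs

-- | "Get": collect every (index, word) pair using the Const functor.
listWords :: String -> [(Int, String)]
listWords = getConst . wordsIndexed (\i w -> Const [(i, w)])

-- | "Edit": upper-case only the third word (index 2) using Identity.
upperThird :: String -> String
upperThird = runIdentity . wordsIndexed (\i w -> Identity (if i == 2 then map toUpper w else w))
```

Loaded in GHCi, upperThird "raindrops on roses and whiskers" leaves every word alone except "roses". The same traversal serves both reading (via Const) and editing (via Identity), which is the property that makes regexes fit the lens model so well.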

Examples

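{-# LANGUAGE OverloadedStrings, QuasiQuotes, TypeApplications #-}
import Control.Lens
import Control.Lens.Regex.Text
import Data.List (sort)
import Data.Text (Text)
import qualified Data.Text as T
import Data.Text.Lens (unpacked)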

txt :: Text
txt = "raindrops on roses and whiskers on kittens"

-- Search
>>> has [regex|whisk|] txt
True

-- Get matches
>>> txt ^.. [regex|\br\w+|] . match
["raindrops","roses"]

-- Edit matches
>>> txt & [regex|\br\w+|] . match %~ T.intersperse '-' . T.toUpper
"R-A-I-N-D-R-O-P-S on R-O-S-E-S and whiskers on kittens"

-- Get Groups
>>> txt ^.. [regex|(\w+) on (\w+)|] . groups
[["raindrops","roses"],["whiskers","kittens"]]

-- Edit Groups
>>> txt & [regex|(\w+) on (\w+)|] . groups %~ reverse
"roses on raindrops and kittens on whiskers"

-- Get the third match
>>> txt ^? [regex|\w+|] . index 2 . match
Just "roses"

-- Match integers, 'Read' them into ints, then sort them in-place
-- dumping them back into the source text afterwards.
>>> "Monday: 29, Tuesday: 99, Wednesday: 3" 
   & partsOf ([regex|\d+|] . match . unpacked . _Show @Int) %~ sort
"Monday: 3, Tuesday: 29, Wednesday: 99"

Basically anything you want to do is possible somehow.

Performance

See the benchmarks.

Summary

Caveat: I'm by no means a benchmarking expert; if you have tips on how to do this better I'm all ears!

  • Search: lens-regex-pcre is marginally slower than pcre-heavy, but well within acceptable margins (within 0.6%)
  • Replace: lens-regex-pcre beats pcre-heavy by ~10%
  • Modify: pcre-heavy doesn't support this operation at all, so I guess lens-regex-pcre wins here :)

How can it possibly be faster if it's based on pcre-heavy?

lens-regex-pcre only uses pcre-heavy for finding the matches, not for substitution/replacement. After that it splits the text into chunks and traverses over them with whichever operation you've chosen. The nature of this implementation makes it a lot easier to understand than imperative implementations of the same thing, which makes it pretty easy to edit and is also the reason we can support arbitrary traversals/actions. It was easy enough, so I went ahead and made the whole thing use ByteString Builders, which sped it up a lot. I suspect that pcre-heavy can benefit from the same optimization if anyone feels like back-porting it; it could be done (almost) as nicely using a simple traverse without any lenses. The whole thing is only about 25 LOC.
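The chunk-and-traverse approach described above can be sketched roughly as follows. This is not the library's actual source: the names Chunk, splitChunks, editMatches, rebuild, modifyMatches and shout are hypothetical, and the match ranges are supplied by hand as (offset, length) pairs instead of coming from pcre-heavy.

```haskell
import Data.Char (toUpper)
import Data.Functor.Identity (Identity (..))
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Char8 as BS
import qualified Data.ByteString.Lazy as BL

-- | Unmatched text is 'Left', matched text is 'Right'.
type Chunk = Either BS.ByteString BS.ByteString

-- | Split the input into chunks, given sorted, non-overlapping
-- (offset, length) ranges. Assumption: a regex engine would supply these.
splitChunks :: [(Int, Int)] -> BS.ByteString -> [Chunk]
splitChunks ranges input = go 0 ranges
  where
    go pos [] = [Left (BS.drop pos input)]
    go pos ((off, len) : rest) =
      Left (slice pos (off - pos)) : Right (slice off len) : go (off + len) rest
    slice start n = BS.take n (BS.drop start input)

-- | Run an effectful edit over every match. The inner 'traverse' only
-- visits 'Right' values, so unmatched chunks pass through untouched.
editMatches :: Applicative f => (BS.ByteString -> f BS.ByteString) -> [Chunk] -> f [Chunk]
editMatches f = traverse (traverse f)

-- | Concatenate the chunks through a Builder, then make the result strict.
rebuild :: [Chunk] -> BS.ByteString
rebuild = BL.toStrict . B.toLazyByteString . foldMap (B.byteString . either id id)

-- | Pure modification of every match, built from the pieces above.
modifyMatches :: (BS.ByteString -> BS.ByteString) -> [(Int, Int)] -> BS.ByteString -> BS.ByteString
modifyMatches f ranges = rebuild . runIdentity . editMatches (Identity . f) . splitChunks ranges

-- | Upper-case the two hand-picked matches "raindrops" and "roses".
shout :: BS.ByteString
shout = modifyMatches (BS.map toUpper) [(0, 9), (13, 5)] (BS.pack "raindrops on roses")
```

Because editMatches works for any Applicative, the same chunk list can be edited purely, in IO, or with any other effect, which is one way to read the "arbitrary traversals/actions" remark.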

I'm neither a benchmarks nor stats person, so please open an issue if anything here seems fishy.

Without pcre-light and pcre-heavy this library wouldn't be possible, so huge thanks to all contributors!

Here are the benchmarks on my 2013 MacBook (2.6 GHz i5):

benchmarking static pattern search/pcre-heavy ... took 20.78 s, total 56 iterations
benchmarked static pattern search/pcre-heavy
time                 375.3 ms   (372.0 ms .. 378.5 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 378.1 ms   (376.4 ms .. 380.8 ms)
std dev              3.747 ms   (922.3 μs .. 5.609 ms)

benchmarking static pattern search/lens-regex-pcre ... took 20.79 s, total 56 iterations
benchmarked static pattern search/lens-regex-pcre
time                 379.5 ms   (376.2 ms .. 382.4 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 377.3 ms   (376.5 ms .. 378.4 ms)
std dev              1.667 ms   (1.075 ms .. 2.461 ms)

benchmarking complex pattern search/pcre-heavy ... took 95.95 s, total 56 iterations
benchmarked complex pattern search/pcre-heavy
time                 1.741 s    (1.737 s .. 1.746 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.746 s    (1.744 s .. 1.749 s)
std dev              4.499 ms   (3.186 ms .. 6.080 ms)

benchmarking complex pattern search/lens-regex-pcre ... took 97.26 s, total 56 iterations
benchmarked complex pattern search/lens-regex-pcre
time                 1.809 s    (1.736 s .. 1.908 s)
                     0.996 R²   (0.991 R² .. 1.000 R²)
mean                 1.757 s    (1.742 s .. 1.810 s)
std dev              42.83 ms   (11.51 ms .. 70.69 ms)

benchmarking simple replacement/pcre-heavy ... took 23.32 s, total 56 iterations
benchmarked simple replacement/pcre-heavy
time                 423.8 ms   (422.4 ms .. 425.3 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 424.0 ms   (422.9 ms .. 426.2 ms)
std dev              2.684 ms   (1.239 ms .. 4.270 ms)

benchmarking simple replacement/lens-regex-pcre ... took 20.84 s, total 56 iterations
benchmarked simple replacement/lens-regex-pcre
time                 382.8 ms   (374.3 ms .. 391.5 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 378.2 ms   (376.3 ms .. 381.0 ms)
std dev              3.794 ms   (2.577 ms .. 5.418 ms)

benchmarking complex replacement/pcre-heavy ... took 24.77 s, total 56 iterations
benchmarked complex replacement/pcre-heavy
time                 448.1 ms   (444.7 ms .. 450.0 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 450.8 ms   (449.5 ms .. 453.9 ms)
std dev              3.129 ms   (947.0 μs .. 4.841 ms)

benchmarking complex replacement/lens-regex-pcre ... took 21.99 s, total 56 iterations
benchmarked complex replacement/lens-regex-pcre
time                 399.9 ms   (398.4 ms .. 402.2 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 399.6 ms   (399.0 ms .. 400.4 ms)
std dev              1.135 ms   (826.2 μs .. 1.604 ms)

Benchmark lens-regex-pcre-bench: FINISH
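As a sanity check on the summary figures, the relative differences can be recomputed from the mean times reported above. The sketch below is only arithmetic on those means; the names speedupPercent, means and report are made up for this example.

```haskell
-- | Percentage by which the candidate is faster than the baseline
-- (a negative value means the candidate is slower).
speedupPercent :: Double -> Double -> Double
speedupPercent baseline candidate = (baseline - candidate) / baseline * 100

-- | (benchmark, pcre-heavy mean in ms, lens-regex-pcre mean in ms),
-- copied from the "mean" rows of the benchmark output.
means :: [(String, Double, Double)]
means =
  [ ("static pattern search", 378.1, 377.3)
  , ("complex pattern search", 1746, 1757)
  , ("simple replacement", 424.0, 378.2)
  , ("complex replacement", 450.8, 399.6)
  ]

-- | Speed-up of lens-regex-pcre over pcre-heavy for each benchmark.
report :: [(String, Double)]
report = [(name, speedupPercent heavy lensy) | (name, heavy, lensy) <- means]
```

Evaluating report in GHCi gives roughly +0.2% and -0.6% for the two searches and roughly +10.8% and +11.4% for the two replacements, which lines up with the "within 0.6%" and "~10%" figures in the summary.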

Behaviour

Precise expected behaviour (and examples) can be found in the test suites.

More Repositories

1

rasa

Extremely modular text editor built in Haskell
Haskell
612
star
2

slick

Static site generator built on Shake configured in Haskell
Haskell
204
star
3

void-space

Well-Typed Typing Tutor where you Type Types... in space... yup, you heard me
Haskell
140
star
4

wc

Beating unix `wc` in Haskell
Haskell
136
star
5

SitePipe

Yet another static site generator - non-opinionated, value-level. Less magic == easier to understand
Haskell
119
star
6

eve

An extensible event-driven application framework in haskell
Haskell
109
star
7

comonads-by-example

Comonads By Example Conference talk
Haskell
89
star
8

Firefly

Simple Haskell http framework
Haskell
87
star
9

json-to-haskell

In goes JSON, out comes a complete Haskell model complete with instances! CLI and web interface available.
Haskell
83
star
10

astar-monad

A smart A* search monad transformer which supports backtracking user-state!
Haskell
81
star
11

jet

A structural editor for JSON values
Haskell
78
star
12

Advent-Of-Code-Polyglot

Examples of "Advent Of Code" solutions in many programming languages.
Python
72
star
13

mad-props

Forward-propagating Constraint Solver monad. Good for solving Sudoku, N-Queens, etc.
Haskell
66
star
14

lens-csv

Lensy interface for parsing CSV's
Haskell
42
star
15

LumberJack

A terminal-ui log watcher written in Go using the Flux architecture
Go
38
star
16

conway

Conway's game of life in 100 lines or less!
Haskell
34
star
17

tempered

Templating engine based on shell interpolation
Haskell
31
star
18

unipatterns

Helpers which allow safe partial pattern matching in lambdas
Haskell
31
star
19

session-sauce

Shell plugin for managing tmux sessions
Shell
29
star
20

slick-template

A template for quickly building sites with slick
CSS
27
star
21

grids

Arbitrary dimension type-safe grids
Haskell
26
star
22

copy-pasta

Shell
26
star
23

dumbwaiter

Extensible HTTP Web server configured entirely by a yaml file
Haskell
25
star
24

haskell-stack-travis-ci

Dead simple setup tools for running a Haskell build matrix using stack for several versions.
Shell
23
star
25

lens-filesystem

Lens interface for your filesystem
Haskell
22
star
26

selections

Haskell Package for operating with selections over an underlying functor
Haskell
22
star
27

btt-quicknav

HTML overlay for quickly navigating your computer
JavaScript
19
star
28

lens-errors

Handling errors which occur deep inside lens-chains
Haskell
17
star
29

proton

Haskell Profunctor Optics experiments
Haskell
15
star
30

Type-Tac-Toe

Type-safe tic-tac-toe using Typesafe programming in Haskell
Haskell
15
star
31

wave-function-collapse

Wave function collapse procedural generation for arbitrary graphs
Haskell
15
star
32

catalyst

There are many category theory implementations, but this one is mine
Haskell
14
star
33

update-monad

An implementation of the Update Monad and a 'Free' version from https://danelahman.github.io/papers/types13postproc.pdf
Haskell
13
star
34

Candor

A toy Parser+Compiler+Typechecker
Haskell
12
star
35

recursive-zipper

Zippers for cofree types
Haskell
12
star
36

climbing-fp-ladder

A record of examples and anecdotes as I ascend the ladder of Functional Programming
12
star
37

charter

Haskell charting library
Haskell
10
star
38

trek

Haskell
10
star
39

vimprove

A series of daily tasks/info to learn vim from beginner to expert one day at a time.
Shell
9
star
40

react-tui

Haskell
9
star
41

dont-argue

Dead-simple command line arguments for python scripts.
Python
8
star
42

advent-of-code-haskell

Advent of Code Solutions in Haskell
Haskell
7
star
43

BoxKite

A very simple blog framework that emphasizes managing posts in a plain-text directory structure. Runs on Google App Engine, but can also be exported as a static site.
Python
7
star
44

flux-monoid

A monoid which counts changing values in a sequence
Haskell
6
star
45

ffs

A Fuse-compatible Functional File System with @isovector
Haskell
6
star
46

lens-friends

Just some lens combinator experiments :)
Haskell
5
star
47

jaunt

a jq clone in purescript
PureScript
5
star
48

json-to-haskell-web

Haskell
5
star
49

brick-filetree

A brick widget for exploring your filetree
Haskell
5
star
50

vim-committed

Sends Desktop notifications to remind you to commit.
Vim Script
4
star
51

CMPT481

Human Computer Interaction Project
JavaScript
4
star
52

rxjs-tutorial

Walkthrough of building a simple webapp using different rxjs patterns
TypeScript
4
star
53

jsonf

An educational JSON functor library for teaching recursion-schemes
Haskell
4
star
54

haskell-library-template

Template for Haskell libraries
Haskell
4
star
55

rust-advent-of-code

Rust
4
star
56

recursion-schemes-by-example

JavaScript
4
star
57

monad-suspend

Experimental Cost-Annotated Self-Yielding Coroutines
Haskell
4
star
58

game-genre-per-day

Weird and whimsical video game genres everyday!
TypeScript
4
star
59

reactive-streams

Reactive stream combinators in Haskell! Implementations of Rx primitives based on the 'machines' library
Haskell
4
star
60

professor

An experimental http server written entirely with profunctors
Haskell
4
star
61

Wirehack

A small circuit-building game built in Haskell
Haskell
3
star
62

ChrisPenner.github.io

Basic Website
HTML
3
star
63

json-to-haskell-purescript

Generate Haskell datatypes from json objects
Dhall
3
star
64

free-cached

Cache previous runs of free monads
Haskell
3
star
65

substrate

File substitution tools I need for my book
Haskell
3
star
66

cards-against-corona

Elm
3
star
67

chip8

Rust
3
star
68

type-arithmetic

Proofs of types as a semiring via Curry-Howard Isomorphism
Haskell
3
star
69

purescript-node-readline-aff

A wrapper around Node.ReadLine for use with the Aff Monad.
PureScript
3
star
70

Flow

An experimental Haskell FRP (streams) library
Haskell
3
star
71

j-lang-haskell

JLang combinators in Haskell
Haskell
3
star
72

purescript-flow

A redux-style application framework
PureScript
3
star
73

catalyst-build

Experimental build system based on composition of categories & arrows
Haskell
3
star
74

focus

cli utility for hacking and slashing data
Haskell
3
star
75

rx-prop

Propagator based reactive extensions library
Haskell
2
star
76

rsi

Structural regex based command pipelines
Haskell
2
star
77

propellant

Foray into propagator networks in Haskell
Haskell
2
star
78

concurrency-comparison

Comparison of basic concurrency primitives and tasks in Haskell and Golang
Haskell
2
star
79

mustache-shake

Build rules for compiling mustache templates using shake
Haskell
2
star
80

scavenger

A basic texting scavenger hunt using Twilio
Python
2
star
81

mailing-list-reader

Haskell
2
star
82

Kaleidoscope

Working through the Kaleidoscope llvm compiler project
Haskell
2
star
83

delve

Terminal UI File Browser
Haskell
2
star
84

free-contravariant

An exploration into free contravariant functors
Haskell
2
star
85

schemer

Uber basic scheme interpreter
Haskell
2
star
86

continuity

Composable Component Framework
Haskell
2
star
87

eve-cli

Terminal event handlers and rendering for `eve` programs
Haskell
2
star
88

sheets

Overly complex attempt at typesafe spreadsheets
Haskell
2
star
89

Simpleton-Algebraics

Learn about Functional Algebraic Types without making your head explode.
2
star
90

flags

Compiles a declarative bash script configuration into a 100% bash flags and argument parser.
Shell
2
star
91

dual-free

Library for combined free & cofree trees into a single type.
Haskell
2
star
92

cofree-zippers

Just an experiment, move along :)
Haskell
1
star
93

grids-images

Tools for interacting with images using grids
Haskell
1
star
94

overlord

Logs dashboard for all your local servers.
JavaScript
1
star
95

SPA-GAE-template

Single page application template for google app engine using react-redux
JavaScript
1
star
96

SpareSpeare

Shakespeare Filler Text Generator
Python
1
star
97

cmpt317

Python
1
star
98

test-specialization

Companion to a blog post on testing patterns in Haskell
Haskell
1
star
99

reified-dicts

Experiment to reify symbols/nats into constraints by matching them within a known set.
Haskell
1
star
100

unison-testing

Just a scrap unison codebase for testing
1
star