  • Stars: 226
  • Rank: 176,514 (Top 4 %)
  • Language: JavaScript
  • License: MIT License
  • Created over 10 years ago
  • Updated about 2 years ago

Repository Details

A source code transpiler that enables the use of ES2015 Unicode regular expressions in ES5.

regexpu

regexpu is a source code transpiler that enables the use of ES2015 Unicode regular expressions in JavaScript-of-today (ES5). It rewrites regular expressions that make use of the ES2015 u flag into equivalent ES5-compatible regular expressions.
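As a rough illustration of what the u flag changes (a minimal sketch that assumes an engine with native u support, such as a recent Node.js), compare how the dot treats an astral symbol with and without the flag. The variable names are illustrative only.

```javascript
'use strict';

// 'foo💩bar': the emoji U+1F4A9 is stored as two UTF-16 code units.
const string = 'foo\uD83D\uDCA9bar';

// Without the u flag, `.` matches a single code unit, so it cannot
// span both halves of the surrogate pair.
const withoutU = /foo.bar/.test(string);

// With the u flag, `.` matches a whole code point.
const withU = /foo.bar/u.test(string);

console.log(string.length); // 8
console.log(withoutU);      // false
console.log(withU);         // true
```

ES5 engines reject the u flag outright, which is why the pattern has to be rewritten ahead of time.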

Here’s an online demo.

Traceur v0.0.61+, Babel v1.5.0+, esnext v0.12.0+, and Bublé v0.12.0+ use regexpu for their u regexp transpilation. The REPL demos for Traceur, Babel, esnext, and Bublé let you try u regexps as well as other ES.next features.

Example

Consider a file named example-es2015.js with the following contents:

var string = 'foo💩bar';
var match = string.match(/foo(.)bar/u);
console.log(match[1]);
// → '💩'

// This regex matches any symbol from U+1F4A9 to U+1F4AB, and nothing else.
var regex = /[\u{1F4A9}-\u{1F4AB}]/u;
// The following regex is equivalent.
var alternative = /[💩-💫]/u;
console.log([
  regex.test('a'),  // false
  regex.test('💩'), // true
  regex.test('💪'), // true
  regex.test('💫'), // true
  regex.test('💬')  // false
]);

Let’s transpile it:

$ regexpu < example-es2015.js > example-es5.js

example-es5.js can now be used in ES5 environments. Its contents are as follows:

var string = 'foo💩bar';
var match = string.match(/foo((?:[\0-\t\x0B\f\x0E-\u2027\u202A-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]))bar/);
console.log(match[1]);
// → '💩'

// This regex matches any symbol from U+1F4A9 to U+1F4AB, and nothing else.
var regex = /(?:\uD83D[\uDCA9-\uDCAB])/;
// The following regex is equivalent.
var alternative = /(?:\uD83D[\uDCA9-\uDCAB])/;
console.log([
  regex.test('a'),  // false
  regex.test('💩'), // true
  regex.test('💪'), // true
  regex.test('💫'), // true
  regex.test('💬')  // false
]);
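The rewritten character class follows from UTF-16 surrogate-pair arithmetic. The sketch below is not regexpu's implementation; it only shows, under the assumption that both ends of the range share the same high surrogate, how the pattern in example-es5.js can be derived by hand. The helper names toSurrogatePair and hex are hypothetical.

```javascript
'use strict';

// Convert an astral code point (above U+FFFF) to its UTF-16 surrogate pair.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);   // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  return [high, low];
}

// Format a code unit as a \uXXXX regular expression escape.
function hex(codeUnit) {
  return '\\u' + codeUnit.toString(16).toUpperCase().padStart(4, '0');
}

const [startHigh, startLow] = toSurrogatePair(0x1F4A9);
const [endHigh, endLow] = toSurrogatePair(0x1F4AB);

// Assumption: startHigh === endHigh, so one class over the low
// surrogates is enough. Ranges spanning several high surrogates
// need more alternatives than this sketch produces.
const pattern = '(?:' + hex(startHigh) + '[' + hex(startLow) + '-' + hex(endLow) + '])';
console.log(pattern); // (?:\uD83D[\uDCA9-\uDCAB])

const regex = new RegExp(pattern); // no u flag needed
console.log(regex.test('\uD83D\uDCA9')); // true  (U+1F4A9)
console.log(regex.test('\uD83D\uDCAC')); // false (U+1F4AC)
```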

Known limitations

  1. regexpu only transpiles regular expression literals, so things like RegExp('…', 'u') are not affected.
  2. regexpu doesn’t polyfill the RegExp.prototype.unicode getter because it’s not possible to do so without side effects.
  3. regexpu doesn’t support canonicalizing the contents of back-references in regular expressions with both the i and u flags set, since that would require transpiling/wrapping strings.
  4. regexpu doesn’t match lone low surrogates accurately. Unfortunately that is impossible to implement due to the lack of lookbehind support in JavaScript regular expressions.
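The fourth limitation can be observed directly. The sketch below assumes an engine with native u support for comparison, and anchors the rewritten dot pattern copied from the transpiled example with ^ and $. Because ES5 has no lookbehind, the last alternative consumes the character in front of a lone low surrogate, so the two patterns disagree on this input.

```javascript
'use strict';

// Native ES2015 behavior: exactly one code point.
const native = /^.$/u;

// The ES5 rewrite of `.` from the transpiled example, anchored with ^ and $.
const transpiled = /^(?:[\0-\t\x0B\f\x0E-\u2027\u202A-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF])$/;

// 'a' followed by a lone low surrogate: two code points.
const input = 'a\uDC00';

console.log(native.test(input));     // false
console.log(transpiled.test(input)); // true: 'a' is swallowed together with the surrogate

// Well-formed input is handled identically by both patterns.
console.log(native.test('\uD83D\uDCA9'));     // true
console.log(transpiled.test('\uD83D\uDCA9')); // true
```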

Installation

To use regexpu programmatically, install it as a dependency via npm:

npm install regexpu --save-dev

To use the command-line interface, install regexpu globally:

npm install regexpu -g

API

regexpu.version

A string representing the semantic version number.

regexpu.rewritePattern(pattern, flags, options)

This is an alias for the rewritePattern function exported by regexpu-core. Please refer to that project’s documentation for more information.

regexpu.rewritePattern uses regjsgen, regjsparser, and regenerate as internal dependencies. If you only need this function in your program, it’s better to include it directly:

// Instead of…
const rewritePattern = require('regexpu').rewritePattern;

// Use this:
const rewritePattern = require('regexpu-core');

This prevents the Recast and Esprima dependencies from being loaded into memory.

regexpu.transformTree(ast, options) or its alias regexpu.transform(ast, options)

This function accepts an abstract syntax tree representing some JavaScript code, and returns a transformed version of the tree in which any regular expression literals that use the ES2015 u flag are rewritten in ES5.

const regexpu = require('regexpu');
const recast = require('recast');
const tree = recast.parse(code); // ES2015 code
const transformedTree = regexpu.transform(tree);
const result = recast.print(transformedTree);
console.log(result.code); // transpiled ES5 code
console.log(result.map); // source map

The optional options object is passed to regexpu-core’s rewritePattern. For a description of the available options, see its documentation.

regexpu.transformTree uses Recast, regjsgen, regjsparser, and regenerate as internal dependencies. If you only need this function in your program, it’s better to include it directly:

const transformTree = require('regexpu/transform-tree');

This prevents the Esprima dependency from being loaded into memory.

regexpu.transpileCode(code, options)

This function accepts a string representing some JavaScript code, and returns a transpiled version of this code in which any regular expression literals that use the ES2015 u flag are rewritten in ES5.

const regexpu = require('regexpu');
const es2015 = 'console.log(/foo.bar/u.test("foo💩bar"));';
const es5 = regexpu.transpileCode(es2015);
// → 'console.log(/foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uD7FF\\uDC00-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF])bar/.test("foo💩bar"));'

The optional options object recognizes properties such as sourceFileName and sourceMapName, both of which must be provided if you want to generate source maps.

const result = regexpu.transpileCode(code, {
  'sourceFileName': 'es2015.js',
  'sourceMapName': 'es2015.js.map',
});
console.log(result.code); // transpiled source code
console.log(result.map); // source map

regexpu.transpileCode uses Esprima, Recast, regjsgen, regjsparser, and regenerate as internal dependencies. If you only need this function in your program, feel free to include it directly:

const transpileCode = require('regexpu/transpile-code');

Transpilers that use regexpu internally

If you’re looking for a general-purpose ES.next-to-ES5 transpiler with support for Unicode regular expressions, consider using one of these:

  • Traceur v0.0.61+
  • Babel v1.5.0+
  • esnext v0.12.0+
  • Bublé v0.12.0+

For maintainers

How to publish a new release

  1. On the main branch, bump the version number in package.json:

    npm version patch -m 'Release v%s'

    Instead of patch, use minor or major as needed.

    Note that this produces a Git commit + tag.

  2. Push the release commit and tag:

    git push && git push --tags

    Our CI then automatically publishes the new release to npm.

Author

Mathias Bynens (twitter/mathias)

License

regexpu is available under the MIT license.
