  • Stars: 100
  • Rank 340,703 (Top 7 %)
  • Language: JavaScript
  • Created over 13 years ago
  • Updated about 2 years ago

Repository Details

A tokenizer that looks like a stream for JavaScript and node.js

Synopsis

A general-purpose tokenizer for JavaScript. Its interface roughly follows that of the node.js WriteStream.

node-tokenizer is published on npm, so you can install it with npm install tokenizer.

How to

  • require the Tokenizer constructor
var Tokenizer = require('tokenizer');
  • construct one (we'll see what the callback is used for)
var t = new Tokenizer(mycallback);
  • add rules
t.addRule(/^my regex$/, 'type');
  • write or pipe to it
t.write(data);
// or
stream.pipe(t);
  • listen for new tokens
t.on('token', function(token, type) {
    // do something useful
    // type is the type of the token (specified with addRule)
    // token is the actual matching string
});
// alternatively, you can use the tokenizer as a readable stream.
  • look out for the end
t.on('end', callback);

The optional callback argument to the constructor is a function that is called for each token; it can assign a different type to the token by returning a string. The function receives two parameters: token (the token that was found) and match, an object like this:

{
    regex: /whatever/, // the regex that matched the token
    type: 'type' // the type of the token
}
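
As an illustration of the callback shape described above, here is a minimal sketch. The KEYWORDS list and the 'word' and 'keyword' type names are hypothetical, the match object is built by hand to mimic the one the tokenizer passes in, and the sketch assumes that returning nothing leaves the rule's own type in place.

```javascript
// Hypothetical callback: promotes some 'word' tokens to a 'keyword' type.
// Returning a string overrides the type; returning nothing is assumed
// to keep match.type unchanged.
var KEYWORDS = ['if', 'else', 'while'];

function mycallback(token, match) {
    if (match.type === 'word' && KEYWORDS.indexOf(token) !== -1) {
        return 'keyword';
    }
}

// Hand-built match object, shaped like the one described above.
var match = { regex: /^\w+$/, type: 'word' };

console.log(mycallback('while', match)); // keyword
console.log(mycallback('hello', match)); // undefined
```

A function like this would be passed as new Tokenizer(mycallback).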

Have a look at the example folder.

Rules

Rules are regular expressions associated with a type name. The tokenizer tries to find the longest string matching one or more rules. When several rules match the same string, priority is given to the rule that was added first (this may change).
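
The matching policy described above can be sketched as a standalone function. This is illustrative only and is not the library's implementation: longestMatch and the sample rules are hypothetical names, and the sketch works on a complete string rather than on streamed chunks.

```javascript
// Illustrative only: not the library's implementation.
// rules: array of { regex, type }, in the order they were added.
function longestMatch(rules, input) {
    // Try the longest prefix first, shrinking until some rule matches.
    for (var end = input.length; end > 0; end--) {
        var candidate = input.slice(0, end);
        for (var i = 0; i < rules.length; i++) {
            // Rules are scanned in insertion order, so the first added wins ties.
            if (rules[i].regex.test(candidate)) {
                return { token: candidate, type: rules[i].type };
            }
        }
    }
    return null; // no rule matches any prefix
}

var rules = [
    { regex: /^\d+$/, type: 'number' },
    { regex: /^\w+$/, type: 'word' },
    { regex: /^\s+$/, type: 'whitespace' }
];

console.log(longestMatch(rules, '42 apples')); // { token: '42', type: 'number' }
console.log(longestMatch(rules, 'apples 42')); // { token: 'apples', type: 'word' }
```

In the first call both the number and word rules match '42'; the number rule wins because it was added first.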

Please note that your regular expressions should use ^ and $ in order to test the whole string. If these are not used, your rule will match every string that contains what you specified, which could be the whole file!
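
The difference can be checked with plain regular expressions, without the tokenizer. The variable names and the sample string below are only for illustration.

```javascript
// An anchored rule must match the whole candidate string;
// an unanchored rule matches any string that merely contains digits.
var anchored = /^\d+$/;
var unanchored = /\d+/;
var chunk = 'abc 123 def';

console.log(anchored.test(chunk));   // false
console.log(unanchored.test(chunk)); // true
console.log(anchored.test('123'));   // true
```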

To do

  • a lot of optimisation
  • being able to share rules across several tokenizers (although this can be achieved through inheritance)
  • probably more hooks
  • more checking

License

MIT

Copyright (c) 2012 Florent Jaby

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

More Repositories

1

node-libspotify

Node bindings for the libspotify C library
C++
103
star
2

node-json-streams

Streams for parsing and stringifying large JSON objects without buffering
JavaScript
79
star
3

node-url-assembler

Assemble urls from route-like templates (/path/:param)
JavaScript
39
star
4

node-parser

a generic parser to parse whatever you want in node.js
JavaScript
25
star
5

node-stream-stream

A stream of streams in order to concatenate the contents of several streams
JavaScript
21
star
6

konga-cli

Command-line client for the Kong admin (http://getkong.org)
JavaScript
18
star
7

node-blue

JSP-like, streamed template engine
JavaScript
14
star
8

node-envie

A tiny module to read and document environment configuration
JavaScript
12
star
9

vim-config

My vim config
Vim Script
11
star
10

node-stream-sink

Collect all data piped to this stream when it closes
JavaScript
11
star
11

node-stream-json-stringify

JSON.stringify, streaming, non-blocking, for real this time.
JavaScript
10
star
12

node-lines

tiny utility to process streams line per line
JavaScript
9
star
13

node-meme

Generate memes from http://memegenerator.net - largely inspired by drbrain/meme
JavaScript
8
star
14

node-sse-writer

Creates a text/event-stream stream as specified by the WD-eventsource W3C recommendation
JavaScript
8
star
15

node-umzug-dynamodb-storage

A storage backend for migrating dynamoDB tables with umzug
JavaScript
5
star
16

node-json-tokenizer

A streaming JSON tokenizer
JavaScript
5
star
17

node-disect

Bisection helper for Javascript
JavaScript
4
star
18

node-duplex-maker

Create a duplex stream from a writable and a readable
JavaScript
3
star
19

hashtag-influenceur

A card game in which you play as a LinkedIn influencer
JavaScript
3
star
20

node-stream-blackhole

A silly writable stream that eats all data
JavaScript
3
star
21

node-catstream

Filenames go in, contents come out. You can't explain that.
JavaScript
3
star
22

node-voice

A bit of code to make my computer speak remotely
JavaScript
3
star
23

node-object-iterator

A module to walk through an object with an iterator
JavaScript
3
star
24

git-split

yet another script to split subtrees apart in submodules
Shell
3
star
25

Drupal-Semaine-CMS

Hetic CMS project
PHP
2
star
26

node-microauth2

Minimal tool to start securing your API with OAuth2
JavaScript
2
star
27

node-services-as-promised

A DI container using promises
JavaScript
2
star
28

lambda-left-pad

left-pad as a service
JavaScript
2
star
29

Purple

Apache Module for Server-Side JavaScript
C++
2
star
30

mocha-rest-interface

Specify your REST endpoints behaviours like a pro and generate their docs
JavaScript
2
star
31

node-exist

eXist XML database REST API wrapper
JavaScript
2
star
32

node-ka-ching

A caching module for streams
JavaScript
2
star
33

node-http-measuring-client

Like the http module, except with stats
JavaScript
2
star
34

node-deep-getset

Utilities to get and set stuff on deeply nested structures
JavaScript
1
star
35

npmake

Make with all your local npm modules
JavaScript
1
star
36

nigel

Proof of concept of a prototype-oriented CMS
JavaScript
1
star
37

questionnaire-snapcity

JavaScript
1
star
38

node-prophet

CLI Wizard utility, with promises
JavaScript
1
star
39

jest-expect-error

missing `expect.error()` from jest as `expectError()`
JavaScript
1
star
40

article-les-statuts-ca-pue

CSS
1
star
41

node-geste

A small module with an executable that calls a requirable jest-cli
JavaScript
1
star
42

npm2dock

Convert any npm package into a runnable container
Shell
1
star
43

Node-Websockets-Test

JavaScript
1
star
44

dumbal.flo.by

A website / mobile app to keep the score when playing Dumbal.
JavaScript
1
star
45

node-sse-reader

reads a text/event-stream stream as specified by the WD-eventsource W3C recommendation
JavaScript
1
star
46

shi-flo-by

Online real-time shi fu mi with leapmotion support.
JavaScript
1
star
47

floby.github.com

JavaScript
1
star
48

centurion-module-search

Wide purpose search module for Centurion CMS
1
star
49

nodejsparis-addons

Native binary addons in node.js with libspotify
JavaScript
1
star
50

node-office-music

A small local webapp to play music in an openspace office
1
star
51

node-stream-write-read

Write to a file, read when it's done
JavaScript
1
star
52

node-dummy-streaming-array-parser

A stream for getting each line of a JSON array
JavaScript
1
star
53

mosic

Mocha + Sinon + Chai as a single setup + other stuff maybe
JavaScript
1
star
54

Echo

A P2P decentralized Instant Messenger
JavaScript
1
star
55

node-cache-depend

A utility function to detect when you should invalidate your cached data
JavaScript
1
star