Fast lexer to extract named exports via analysis from CommonJS modules

CJS Module Lexer

A very fast JS CommonJS module syntax lexer used to detect the most likely list of named exports of a CommonJS module.

Outputs the list of named exports (exports.name = ...) and possible module reexports (module.exports = require('...')), including the common transpiler variations of these cases.

Forked from https://github.com/guybedford/es-module-lexer.

Comprehensively handles the JS language grammar while remaining small and fast: ~90ms per MB of JS cold and ~15ms per MB of JS warm. See the benchmarks for more info.

Project Status

This project is used in Node.js core for detecting the named exports available when importing a CJS module into ESM, and is maintained for this purpose.

PRs will be accepted and upstreamed for parser bugs, performance improvements or new syntax support only.

Detection patterns for this project are frozen, because adding any new export detection pattern would fragment backwards compatibility. Specifically, it would be very difficult to figure out why an ES module named export for a CommonJS module works in newer Node.js versions but not in older ones. The problem would only be discovered downstream of module authors, who would then have to understand which patterns in this project provide full backwards compatibility. With the detected patterns fully frozen, anything that works in one Node.js version works in every other. Build tools can also reliably treat the syntax supported by this project as part of their output target.

Usage

npm install cjs-module-lexer

For use in CommonJS:

const { parse } = require('cjs-module-lexer');

// `init` returns a promise for parity with the ESM API, but you do not have to call it

const { exports, reexports } = parse(`
  // named exports detection
  module.exports.a = 'a';
  (function () {
    exports.b = 'b';
  })();
  Object.defineProperty(exports, 'c', { value: 'c' });
  /* exports.d = 'not detected'; */

  // reexports detection
  if (maybe) module.exports = require('./dep1.js');
  if (another) module.exports = require('./dep2.js');

  // literal exports assignments
  module.exports = { a, b: c, d, 'e': f }

  // __esModule detection
  Object.defineProperty(module.exports, '__esModule', { value: true })
`);

// exports === ['a', 'b', 'c', '__esModule']
// reexports === ['./dep1.js', './dep2.js']

When using the ESM version, Wasm is supported instead:

import { parse, init } from 'cjs-module-lexer';
// init needs to be called and waited upon
await init();
const { exports, reexports } = parse(source);

The Wasm build is around 1.5x faster and has no cold start.

Grammar

CommonJS exports matches are run against the source token stream.

The token grammar is:

IDENTIFIER: As defined by ECMA-262, without support for identifier `\` escapes, filtered to remove strict reserved words:
            "implements", "interface", "let", "package", "private", "protected", "public", "static", "yield", "enum"

STRING_LITERAL: A `"` or `'` bounded ECMA-262 string literal.

MODULE_EXPORTS: `module` `.` `exports`

EXPORTS_IDENTIFIER: MODULE_EXPORTS | `exports`

EXPORTS_DOT_ASSIGN: EXPORTS_IDENTIFIER `.` IDENTIFIER `=`

EXPORTS_LITERAL_COMPUTED_ASSIGN: EXPORTS_IDENTIFIER `[` STRING_LITERAL `]` `=`

EXPORTS_LITERAL_PROP: (IDENTIFIER  (`:` IDENTIFIER)?) | (STRING_LITERAL `:` IDENTIFIER)

EXPORTS_SPREAD: `...` (IDENTIFIER | REQUIRE)

EXPORTS_MEMBER: EXPORTS_DOT_ASSIGN | EXPORTS_LITERAL_COMPUTED_ASSIGN

EXPORTS_DEFINE: `Object` `.` `defineProperty` `(` EXPORTS_IDENTIFIER `,` STRING_LITERAL

EXPORTS_DEFINE_VALUE: EXPORTS_DEFINE `, {`
  (`enumerable: true,`)?
  (
    `value:` |
    `get` (`: function` IDENTIFIER? )?  `() {` return IDENTIFIER (`.` IDENTIFIER | `[` STRING_LITERAL `]`)? `;`? `}` `,`?
  )
  `})`

EXPORTS_LITERAL: MODULE_EXPORTS `=` `{` ((EXPORTS_LITERAL_PROP | EXPORTS_SPREAD) `,`)+ `}`

REQUIRE: `require` `(` STRING_LITERAL `)`

EXPORTS_ASSIGN: (`var` | `const` | `let`) IDENTIFIER `=` (`_interopRequireWildcard (`)? REQUIRE

MODULE_EXPORTS_ASSIGN: MODULE_EXPORTS `=` REQUIRE

EXPORT_STAR: (`__export` | `__exportStar`) `(` REQUIRE

EXPORT_STAR_LIB: `Object.keys(` IDENTIFIER$1 `).forEach(function (` IDENTIFIER$2 `) {`
  (
    (
      `if (` IDENTIFIER$2 `===` ( `'default'` | `"default"` ) `||` IDENTIFIER$2 `===` ( `'__esModule'` | `"__esModule"` ) `) return` `;`?
      (
        (`if (Object` `.prototype`? `.hasOwnProperty.call(`  IDENTIFIER `, ` IDENTIFIER$2 `)) return` `;`?)?
        (`if (` IDENTIFIER$2 `in` EXPORTS_IDENTIFIER `&&` EXPORTS_IDENTIFIER `[` IDENTIFIER$2 `] ===` IDENTIFIER$1 `[` IDENTIFIER$2 `]) return` `;`)?
      )?
    ) |
    `if (` IDENTIFIER$2 `!==` ( `'default'` | `"default"` ) (`&& !` (`Object` `.prototype`? `.hasOwnProperty.call(`  IDENTIFIER `, ` IDENTIFIER$2 `)` | IDENTIFIER `.hasOwnProperty(` IDENTIFIER$2 `)`))? `)`
  )
  (
    EXPORTS_IDENTIFIER `[` IDENTIFIER$2 `] =` IDENTIFIER$1 `[` IDENTIFIER$2 `]` `;`? |
    `Object.defineProperty(` EXPORTS_IDENTIFIER `, ` IDENTIFIER$2 `, { enumerable: true, get` (`: function` IDENTIFIER? )?  `() { return ` IDENTIFIER$1 `[` IDENTIFIER$2 `]` `;`? `}` `,`? `})` `;`?
  )
  `})`

Spacing between tokens is taken to be any ECMA-262 whitespace, ECMA-262 block comment or ECMA-262 line comment.

  • The returned export names are taken to be the combination of:
    1. All IDENTIFIER and STRING_LITERAL slots for EXPORTS_MEMBER and EXPORTS_LITERAL matches.
    2. The first STRING_LITERAL slot for all EXPORTS_DEFINE_VALUE matches, unless that same string also appears in an EXPORTS_DEFINE match that is not an EXPORTS_DEFINE_VALUE match.
  • The reexport specifiers are taken to be the combination of:
    1. The REQUIRE matches of the last match of either MODULE_EXPORTS_ASSIGN or EXPORTS_LITERAL.
    2. All top-level EXPORT_STAR REQUIRE matches and EXPORTS_ASSIGN matches whose IDENTIFIER also matches the first IDENTIFIER in EXPORT_STAR_LIB.
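
As a rough illustration of the first export-name rule, the sketch below approximates EXPORTS_DOT_ASSIGN and EXPORTS_LITERAL_COMPUTED_ASSIGN with regular expressions. It is not how the lexer is implemented: the lexer matches against a token stream and skips comments and strings, which this sketch does not attempt. The function name `sketchNamedExports` is hypothetical.

```javascript
// Simplified sketch only: approximates two grammar rules with regular
// expressions. Unlike the real lexer, it does not skip comments or strings.
function sketchNamedExports(source) {
  const names = new Set();
  // EXPORTS_DOT_ASSIGN: (module.exports | exports) . IDENTIFIER =
  const dotAssign =
    /(?<![\w$.])(?:module\s*\.\s*)?exports\s*\.\s*([A-Za-z_$][\w$]*)\s*=(?!=)/g;
  // EXPORTS_LITERAL_COMPUTED_ASSIGN: (module.exports | exports) [ STRING_LITERAL ] =
  const computedAssign =
    /(?<![\w$.])(?:module\s*\.\s*)?exports\s*\[\s*(['"])((?:(?!\1)[^\\\n])*)\1\s*\]\s*=(?!=)/g;
  for (const m of source.matchAll(dotAssign)) names.add(m[1]);
  for (const m of source.matchAll(computedAssign)) names.add(m[2]);
  return [...names];
}

console.log(sketchNamedExports(`
  exports.a = 'a';
  module.exports['b'] = 'b';
  /* exports.d = 'not detected by the real lexer'; */
`)); // [ 'a', 'd', 'b' ]
```

The sketch reports `d` even though it sits inside a block comment. Avoiding that kind of false match is one reason the grammar is defined over tokens rather than raw text.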

Parsing Examples

Named Exports Parsing

The basic matching rules for named exports are exports.name, exports['name'] or Object.defineProperty(exports, 'name', ...). This matching is done without scope analysis and regardless of the expression position:

// DETECTS EXPORTS: a, b
(function (exports) {
  exports.a = 'a'; 
  exports['b'] = 'b';
})(exports);

Because there is no scope analysis, the above detection may overclassify:

// DETECTS EXPORTS: a, b, c
(function (exports, Object) {
  exports.a = 'a';
  exports['b'] = 'b';
  if (false)
    exports.c = 'c';
})(NOT_EXPORTS, NOT_OBJECT);

It will in turn underclassify in cases where the identifiers are renamed:

// DETECTS: NO EXPORTS
(function (e) {
  e.a = 'a';
  e['b'] = 'b';
})(exports);

Getter Exports Parsing

Object.defineProperty is detected specifically for value forms and for getter forms that return an identifier or member expression:

// DETECTS: a, b, c, d, __esModule
Object.defineProperty(exports, 'a', {
  enumerable: true,
  get: function () {
    return q.p;
  }
});
Object.defineProperty(exports, 'b', {
  enumerable: true,
  get: function () {
    return q['p'];
  }
});
Object.defineProperty(exports, 'c', {
  enumerable: true,
  get () {
    return b;
  }
});
Object.defineProperty(exports, 'd', { value: 'd' });
Object.defineProperty(exports, '__esModule', { value: true });

Value properties are also detected specifically:

Object.defineProperty(exports, 'a', {
  value: 'no problem'
});

To avoid matching getters that have side effects, any getter that does not match the forms above opts its export name out of getter matching:

// DETECTS: NO EXPORTS
Object.defineProperty(exports, 'a', {
  get () {
    return 'nope';
  }
});

if (false) {
  Object.defineProperty(module.exports, 'a', {
    get () {
      return dynamic();
    }
  })
}
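
The opt-out rule can be pictured as a small set computation. The sketch below assumes each Object.defineProperty match has already been reduced to a hypothetical `{ name, safe }` record, where `safe` means the definition used a value or one of the recognized getter forms. It only illustrates the rule as described in this section and is not the lexer's actual code.

```javascript
// Sketch of the opt-out rule for Object.defineProperty matches.
// `name` is the property name; `safe` is true when the definition used a
// value or a recognized getter form (hypothetical record shape).
function resolveDefineExports(defineMatches) {
  // Any name with at least one unrecognized definition is blocked entirely.
  const blocked = new Set(
    defineMatches.filter((m) => !m.safe).map((m) => m.name)
  );
  const names = new Set();
  for (const m of defineMatches) {
    if (m.safe && !blocked.has(m.name)) names.add(m.name);
  }
  return [...names];
}

const result = resolveDefineExports([
  { name: 'a', safe: true },  // { value: 'a' }
  { name: 'b', safe: true },  // get () { return b; }
  { name: 'a', safe: false }, // get () { return dynamic(); }
]);
console.log(result); // [ 'b' ]
```

Here `a` is dropped even though one of its definitions was a plain value, because another definition of the same name used an unrecognized getter.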

Alternative object definition structures or getter function bodies are not detected:

// DETECTS: NO EXPORTS
Object.defineProperty(exports, 'a', {
  enumerable: false,
  get () {
    return p;
  }
});
Object.defineProperty(exports, 'b', {
  configurable: true,
  get () {
    return p;
  }
});
Object.defineProperty(exports, 'c', {
  get: () => p
});
Object.defineProperty(exports, 'd', {
  enumerable: true,
  get: function () {
    return dynamic();
  }
});
Object.defineProperty(exports, 'e', {
  enumerable: true,
  get () {
    return 'str';
  }
});

Object.defineProperties is also not supported.

Exports Object Assignment

A best-effort attempt is made to detect module.exports object assignments, but because this is not a full parser, arbitrary expressions are not handled in the object parsing process.

Simple object definitions are supported:

// DETECTS EXPORTS: a, b, c
module.exports = {
  a,
  'b': b,
  c: c,
  ...d
};

Object properties that are not identifiers or string expressions will bail out of the object detection, while spreads are ignored:

// DETECTS EXPORTS: a, b
module.exports = {
  a,
  ...d,
  b: require('c'),
  c: "not detected since require('c') above bails the object detection"
}

Object.defineProperties is not currently supported either.

module.exports reexport assignment

Any module.exports = require('mod') assignment is detected as a reexport, but only the last one is returned:

// DETECTS REEXPORTS: c
module.exports = require('a');
(module => module.exports = require('b'))(NOT_MODULE);
if (false) module.exports = require('c');

This is to avoid over-classification in Webpack bundles with externals which include module.exports = require('external') in their source for every external dependency.
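
The "last one wins" behaviour can be sketched with a regular expression. This is only an approximation under the assumption that the source contains no misleading comments or strings; the lexer itself works on tokens. The function name `sketchLastReexport` is hypothetical.

```javascript
// Simplified sketch: find every `module.exports = require('...')` with a
// regular expression and keep only the last specifier. Comments and strings
// are not skipped here, unlike in the real lexer.
function sketchLastReexport(source) {
  const pattern =
    /module\s*\.\s*exports\s*=\s*require\s*\(\s*(['"])((?:(?!\1)[^\\\n])*)\1\s*\)/g;
  let last;
  for (const m of source.matchAll(pattern)) last = m[2];
  return last === undefined ? [] : [last];
}

const source = `
module.exports = require('a');
(module => module.exports = require('b'))(NOT_MODULE);
if (false) module.exports = require('c');
`;
console.log(sketchLastReexport(source)); // [ 'c' ]
```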

In an exports object assignment, each spread of a require() call is detected as a separate reexport:

// DETECTS REEXPORTS: a, b
module.exports = require('ignored');
module.exports = {
  ...require('a'),
  ...require('b')
};

Transpiler Re-exports

For named exports, transpiler output works well with the rules described above.

But for star re-exports, special care is taken to support common patterns of transpiler outputs from Babel and TypeScript as well as bundlers like RollupJS. These reexport and star reexport patterns are restricted to only be detected at the top-level as provided by the direct output of these tools.

For example, export * from 'external' is output by Babel as:

"use strict";

exports.__esModule = true;

var _external = require("external");

Object.keys(_external).forEach(function (key) {
  if (key === "default" || key === "__esModule") return;
  exports[key] = _external[key];
});

Here the var _external = require("external") statement is specifically detected, as is the Object.keys(_external) statement, down to the exact form of that entire expression, including minor variations of the output. The _external and key identifiers are carefully matched in this detection.
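
To see why this pattern is reported as a reexport specifier rather than as a list of names, it helps to run the loop on stand-in objects. The copied names depend on the keys of the required module at runtime, so they cannot be read off the source text alone. In the sketch below, `_external` and `fakeExports` are hypothetical stand-ins for the real module objects.

```javascript
// Stand-ins for the real module objects; both names are hypothetical.
const _external = { __esModule: true, default: 'main', x: 1, y: 2 };
const fakeExports = {};

// Same loop shape as the Babel output: copy every key except the two
// interop markers.
Object.keys(_external).forEach(function (key) {
  if (key === 'default' || key === '__esModule') return;
  fakeExports[key] = _external[key];
});

console.log(Object.keys(fakeExports)); // [ 'x', 'y' ]
```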

Similarly for TypeScript, export * from 'external' is output as:

"use strict";
function __export(m) {
    for (var p in m) if (!exports.hasOwnProperty(p)) exports[p] = m[p];
}
Object.defineProperty(exports, "__esModule", { value: true });
__export(require("external"));

Here the __export(require("external")) statement is explicitly detected as a reexport, including the variations tslib.__export and __exportStar.

Environment Support

Node.js 10+, and all browsers with WebAssembly support.

JS Grammar Support

  • Token state parses all line comments, block comments, strings, template strings, blocks, parens and punctuators.
  • Division operator / regex token ambiguity is handled via backtracking checks against punctuator prefixes, including closing brace or paren backtracking.
  • Always correctly parses valid JS source, but may parse invalid JS source without errors.

Benchmarks

Benchmarks can be run with npm run bench.

Current results:

JS Build:

Module load time
> 4ms
Cold Run, All Samples
test/samples/*.js (3635 KiB)
> 299ms

Warm Runs (average of 25 runs)
test/samples/angular.js (1410 KiB)
> 13.96ms
test/samples/angular.min.js (303 KiB)
> 4.72ms
test/samples/d3.js (553 KiB)
> 6.76ms
test/samples/d3.min.js (250 KiB)
> 4ms
test/samples/magic-string.js (34 KiB)
> 0.64ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (698 KiB)
> 8.48ms
test/samples/rollup.min.js (367 KiB)
> 5.36ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3635 KiB)
> 40.28ms

Wasm Build:

Module load time
> 10ms
Cold Run, All Samples
test/samples/*.js (3635 KiB)
> 43ms

Warm Runs (average of 25 runs)
test/samples/angular.js (1410 KiB)
> 9.32ms
test/samples/angular.min.js (303 KiB)
> 3.16ms
test/samples/d3.js (553 KiB)
> 5ms
test/samples/d3.min.js (250 KiB)
> 2.32ms
test/samples/magic-string.js (34 KiB)
> 0.16ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (698 KiB)
> 6.28ms
test/samples/rollup.min.js (367 KiB)
> 3.6ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3635 KiB)
> 27.76ms
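
For comparison with the per-MB figures quoted in the introduction, the "All Samples" rows above can be converted to milliseconds per MiB. This is plain arithmetic on the numbers listed here. Results will differ between machines and runs, so the output is an illustration, not a guarantee.

```javascript
// Figures copied from the "All Samples" rows above (3635 KiB of input).
const totalMiB = 3635 / 1024;
const timesMs = { jsCold: 299, jsWarm: 40.28, wasmCold: 43, wasmWarm: 27.76 };

// Milliseconds per MiB, rounded to one decimal place.
const msPerMiB = {};
for (const [name, ms] of Object.entries(timesMs)) {
  msPerMiB[name] = Number((ms / totalMiB).toFixed(1));
}
// Ratio of warm JS time to warm Wasm time.
const warmSpeedup = Number((timesMs.jsWarm / timesMs.wasmWarm).toFixed(2));

console.log(msPerMiB); // { jsCold: 84.2, jsWarm: 11.3, wasmCold: 12.1, wasmWarm: 7.8 }
console.log(warmSpeedup); // 1.45
```

On these numbers the warm Wasm build is about 1.45x faster than the warm JS build, in line with the "around 1.5x" figure given in the usage section.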

Wasm Build Steps

To build, download the WASI SDK from https://github.com/WebAssembly/wasi-sdk/releases.

The Makefile assumes the existence of "wasi-sdk-11.0" and "wabt" (optional) as sibling folders to this project.

The build through the Makefile is then run via make lib/lexer.wasm, which can also be triggered via npm run build-wasm to create dist/lexer.js.

On Windows it may be preferable to use the Linux subsystem.

After the WebAssembly build, the CJS build can be triggered via npm run build.

Optimization passes are run with Binaryen prior to publish to reduce the WebAssembly footprint.

License

MIT
