• Stars
    star
    210
  • Rank 180,884 (Top 4 %)
  • Language
    JavaScript
  • License
    MIT License
  • Created almost 4 years ago
  • Updated 5 days ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Fast lexer to extract named exports via analysis from CommonJS modules

CJS Module Lexer

Build Status

A very fast JS CommonJS module syntax lexer used to detect the most likely list of named exports of a CommonJS module.

Outputs the list of named exports (exports.name = ...) and possible module reexports (module.exports = require('...')), including the common transpiler variations of these cases.

Forked from https://github.com/guybedford/es-module-lexer.

Comprehensively handles the JS language grammar while remaining small and fast. - ~90ms per MB of JS cold and ~15ms per MB of JS warm, see benchmarks for more info.

Project Status

This project is used in Node.js core for detecting the named exports available when importing a CJS module into ESM, and is maintained for this purpose.

PRs will be accepted and upstreamed for parser bugs, performance improvements or new syntax support only.

Detection patterns for this project are frozen. This is because adding any new export detection patterns would result in fragmented backwards-compatibility. Specifically, it would be very difficult to figure out why an ES module named export for CommonJS might work in newer Node.js versions but not older versions. This problem would only be discovered downstream of module authors, with the fix for module authors being to then have to understand which patterns in this project provide full backwards-compatibily. Rather, by fully freezing the detected patterns, if it works in any Node.js version it will work in any other. Build tools can also reliably treat the supported syntax for this project as a part of their output target for ensuring syntax support.

Usage

npm install cjs-module-lexer

For use in CommonJS:

const { parse } = require('cjs-module-lexer');

// `init` return a promise for parity with the ESM API, but you do not have to call it

const { exports, reexports } = parse(`
  // named exports detection
  module.exports.a = 'a';
  (function () {
    exports.b = 'b';
  })();
  Object.defineProperty(exports, 'c', { value: 'c' });
  /* exports.d = 'not detected'; */

  // reexports detection
  if (maybe) module.exports = require('./dep1.js');
  if (another) module.exports = require('./dep2.js');

  // literal exports assignments
  module.exports = { a, b: c, d, 'e': f }

  // __esModule detection
  Object.defineProperty(module.exports, '__esModule', { value: true })
`);

// exports === ['a', 'b', 'c', '__esModule']
// reexports === ['./dep1.js', './dep2.js']

When using the ESM version, Wasm is supported instead:

import { parse, init } from 'cjs-module-lexer';
// init needs to be called and waited upon
await init();
const { exports, reexports } = parse(source);

The Wasm build is around 1.5x faster and without a cold start.

Grammar

CommonJS exports matches are run against the source token stream.

The token grammar is:

IDENTIFIER: As defined by ECMA-262, without support for identifier `\` escapes, filtered to remove strict reserved words:
            "implements", "interface", "let", "package", "private", "protected", "public", "static", "yield", "enum"

STRING_LITERAL: A `"` or `'` bounded ECMA-262 string literal.

MODULE_EXPORTS: `module` `.` `exports`

EXPORTS_IDENTIFIER: MODULE_EXPORTS_IDENTIFIER | `exports`

EXPORTS_DOT_ASSIGN: EXPORTS_IDENTIFIER `.` IDENTIFIER `=`

EXPORTS_LITERAL_COMPUTED_ASSIGN: EXPORTS_IDENTIFIER `[` STRING_LITERAL `]` `=`

EXPORTS_LITERAL_PROP: (IDENTIFIER  (`:` IDENTIFIER)?) | (STRING_LITERAL `:` IDENTIFIER)

EXPORTS_SPREAD: `...` (IDENTIFIER | REQUIRE)

EXPORTS_MEMBER: EXPORTS_DOT_ASSIGN | EXPORTS_LITERAL_COMPUTED_ASSIGN

EXPORTS_DEFINE: `Object` `.` `defineProperty `(` EXPORTS_IDENFITIER `,` STRING_LITERAL

EXPORTS_DEFINE_VALUE: EXPORTS_DEFINE `, {`
  (`enumerable: true,`)?
  (
    `value:` |
    `get` (`: function` IDENTIFIER? )?  `() {` return IDENTIFIER (`.` IDENTIFIER | `[` STRING_LITERAL `]`)? `;`? `}` `,`?
  )
  `})`

EXPORTS_LITERAL: MODULE_EXPORTS `=` `{` (EXPORTS_LITERAL_PROP | EXPORTS_SPREAD) `,`)+ `}`

REQUIRE: `require` `(` STRING_LITERAL `)`

EXPORTS_ASSIGN: (`var` | `const` | `let`) IDENTIFIER `=` (`_interopRequireWildcard (`)? REQUIRE

MODULE_EXPORTS_ASSIGN: MODULE_EXPORTS `=` REQUIRE

EXPORT_STAR: (`__export` | `__exportStar`) `(` REQUIRE

EXPORT_STAR_LIB: `Object.keys(` IDENTIFIER$1 `).forEach(function (` IDENTIFIER$2 `) {`
  (
    (
      `if (` IDENTIFIER$2 `===` ( `'default'` | `"default"` ) `||` IDENTIFIER$2 `===` ( '__esModule' | `"__esModule"` ) `) return` `;`?
      (
        (`if (Object` `.prototype`? `.hasOwnProperty.call(`  IDENTIFIER `, ` IDENTIFIER$2 `)) return` `;`?)?
        (`if (` IDENTIFIER$2 `in` EXPORTS_IDENTIFIER `&&` EXPORTS_IDENTIFIER `[` IDENTIFIER$2 `] ===` IDENTIFIER$1 `[` IDENTIFIER$2 `]) return` `;`)?
      )?
    ) |
    `if (` IDENTIFIER$2 `!==` ( `'default'` | `"default"` ) (`&& !` (`Object` `.prototype`? `.hasOwnProperty.call(`  IDENTIFIER `, ` IDENTIFIER$2 `)` | IDENTIFIER `.hasOwnProperty(` IDENTIFIER$2 `)`))? `)`
  )
  (
    EXPORTS_IDENTIFIER `[` IDENTIFIER$2 `] =` IDENTIFIER$1 `[` IDENTIFIER$2 `]` `;`? |
    `Object.defineProperty(` EXPORTS_IDENTIFIER `, ` IDENTIFIER$2 `, { enumerable: true, get` (`: function` IDENTIFIER? )?  `() { return ` IDENTIFIER$1 `[` IDENTIFIER$2 `]` `;`? `}` `,`? `})` `;`?
  )
  `})`

Spacing between tokens is taken to be any ECMA-262 whitespace, ECMA-262 block comment or ECMA-262 line comment.

  • The returned export names are taken to be the combination of:
    1. All IDENTIFIER and STRING_LITERAL slots for EXPORTS_MEMBER and EXPORTS_LITERAL matches.
    2. The first STRING_LITERAL slot for all EXPORTS_DEFINE_VALUE matches where that same string is not an EXPORTS_DEFINE match that is not also an EXPORTS_DEFINE_VALUE match.
  • The reexport specifiers are taken to be the combination of:
    1. The REQUIRE matches of the last matched of either MODULE_EXPORTS_ASSIGN or EXPORTS_LITERAL.
    2. All top-level EXPORT_STAR REQUIRE matches and EXPORTS_ASSIGN matches whose IDENTIFIER also matches the first IDENTIFIER in EXPORT_STAR_LIB.

Parsing Examples

Named Exports Parsing

The basic matching rules for named exports are exports.name, exports['name'] or Object.defineProperty(exports, 'name', ...). This matching is done without scope analysis and regardless of the expression position:

// DETECTS EXPORTS: a, b
(function (exports) {
  exports.a = 'a'; 
  exports['b'] = 'b';
})(exports);

Because there is no scope analysis, the above detection may overclassify:

// DETECTS EXPORTS: a, b, c
(function (exports, Object) {
  exports.a = 'a';
  exports['b'] = 'b';
  if (false)
    exports.c = 'c';
})(NOT_EXPORTS, NOT_OBJECT);

It will in turn underclassify in cases where the identifiers are renamed:

// DETECTS: NO EXPORTS
(function (e) {
  e.a = 'a';
  e['b'] = 'b';
})(exports);

Getter Exports Parsing

Object.defineProperty is detected for specifically value and getter forms returning an identifier or member expression:

// DETECTS: a, b, c, d, __esModule
Object.defineProperty(exports, 'a', {
  enumerable: true,
  get: function () {
    return q.p;
  }
});
Object.defineProperty(exports, 'b', {
  enumerable: true,
  get: function () {
    return q['p'];
  }
});
Object.defineProperty(exports, 'c', {
  enumerable: true,
  get () {
    return b;
  }
});
Object.defineProperty(exports, 'd', { value: 'd' });
Object.defineProperty(exports, '__esModule', { value: true });

Value properties are also detected specifically:

Object.defineProperty(exports, 'a', {
  value: 'no problem'
});

To avoid matching getters that have side effects, any getter for an export name that does not support the forms above will opt-out of the getter matching:

// DETECTS: NO EXPORTS
Object.defineProperty(exports, 'a', {
  get () {
    return 'nope';
  }
});

if (false) {
  Object.defineProperty(module.exports, 'a', {
    get () {
      return dynamic();
    }
  })
}

Alternative object definition structures or getter function bodies are not detected:

// DETECTS: NO EXPORTS
Object.defineProperty(exports, 'a', {
  enumerable: false,
  get () {
    return p;
  }
});
Object.defineProperty(exports, 'b', {
  configurable: true,
  get () {
    return p;
  }
});
Object.defineProperty(exports, 'c', {
  get: () => p
});
Object.defineProperty(exports, 'd', {
  enumerable: true,
  get: function () {
    return dynamic();
  }
});
Object.defineProperty(exports, 'e', {
  enumerable: true,
  get () {
    return 'str';
  }
});

Object.defineProperties is also not supported.

Exports Object Assignment

A best-effort is made to detect module.exports object assignments, but because this is not a full parser, arbitrary expressions are not handled in the object parsing process.

Simple object definitions are supported:

// DETECTS EXPORTS: a, b, c
module.exports = {
  a,
  'b': b,
  c: c,
  ...d
};

Object properties that are not identifiers or string expressions will bail out of the object detection, while spreads are ignored:

// DETECTS EXPORTS: a, b
module.exports = {
  a,
  ...d,
  b: require('c'),
  c: "not detected since require('c') above bails the object detection"
}

Object.defineProperties is not currently supported either.

module.exports reexport assignment

Any module.exports = require('mod') assignment is detected as a reexport, but only the last one is returned:

// DETECTS REEXPORTS: c
module.exports = require('a');
(module => module.exports = require('b'))(NOT_MODULE);
if (false) module.exports = require('c');

This is to avoid over-classification in Webpack bundles with externals which include module.exports = require('external') in their source for every external dependency.

In exports object assignment, any spread of require() are detected as multiple separate reexports:

// DETECTS REEXPORTS: a, b
module.exports = require('ignored');
module.exports = {
  ...require('a'),
  ...require('b')
};

Transpiler Re-exports

For named exports, transpiler output works well with the rules described above.

But for star re-exports, special care is taken to support common patterns of transpiler outputs from Babel and TypeScript as well as bundlers like RollupJS. These reexport and star reexport patterns are restricted to only be detected at the top-level as provided by the direct output of these tools.

For example, export * from 'external' is output by Babel as:

"use strict";

exports.__esModule = true;

var _external = require("external");

Object.keys(_external).forEach(function (key) {
  if (key === "default" || key === "__esModule") return;
  exports[key] = _external[key];
});

Where the var _external = require("external") is specifically detected as well as the Object.keys(_external) statement, down to the exact for of that entire expression including minor variations of the output. The _external and key identifiers are carefully matched in this detection.

Similarly for TypeScript, export * from 'external' is output as:

"use strict";
function __export(m) {
    for (var p in m) if (!exports.hasOwnProperty(p)) exports[p] = m[p];
}
Object.defineProperty(exports, "__esModule", { value: true });
__export(require("external"));

Where the __export(require("external")) statement is explicitly detected as a reexport, including variations tslib.__export and __exportStar.

Environment Support

Node.js 10+, and all browsers with Web Assembly support.

JS Grammar Support

  • Token state parses all line comments, block comments, strings, template strings, blocks, parens and punctuators.
  • Division operator / regex token ambiguity is handled via backtracking checks against punctuator prefixes, including closing brace or paren backtracking.
  • Always correctly parses valid JS source, but may parse invalid JS source without errors.

Benchmarks

Benchmarks can be run with npm run bench.

Current results:

JS Build:

Module load time
> 4ms
Cold Run, All Samples
test/samples/*.js (3635 KiB)
> 299ms

Warm Runs (average of 25 runs)
test/samples/angular.js (1410 KiB)
> 13.96ms
test/samples/angular.min.js (303 KiB)
> 4.72ms
test/samples/d3.js (553 KiB)
> 6.76ms
test/samples/d3.min.js (250 KiB)
> 4ms
test/samples/magic-string.js (34 KiB)
> 0.64ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (698 KiB)
> 8.48ms
test/samples/rollup.min.js (367 KiB)
> 5.36ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3635 KiB)
> 40.28ms

Wasm Build:

Module load time
> 10ms
Cold Run, All Samples
test/samples/*.js (3635 KiB)
> 43ms

Warm Runs (average of 25 runs)
test/samples/angular.js (1410 KiB)
> 9.32ms
test/samples/angular.min.js (303 KiB)
> 3.16ms
test/samples/d3.js (553 KiB)
> 5ms
test/samples/d3.min.js (250 KiB)
> 2.32ms
test/samples/magic-string.js (34 KiB)
> 0.16ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (698 KiB)
> 6.28ms
test/samples/rollup.min.js (367 KiB)
> 3.6ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3635 KiB)
> 27.76ms

Wasm Build Steps

To build download the WASI SDK from https://github.com/WebAssembly/wasi-sdk/releases.

The Makefile assumes the existence of "wasi-sdk-11.0" and "wabt" (optional) as sibling folders to this project.

The build through the Makefile is then run via make lib/lexer.wasm, which can also be triggered via npm run build-wasm to create dist/lexer.js.

On Windows it may be preferable to use the Linux subsystem.

After the Web Assembly build, the CJS build can be triggered via npm run build.

Optimization passes are run with Binaryen prior to publish to reduce the Web Assembly footprint.

License

MIT

More Repositories

1

node

Node.js JavaScript runtime ✨🐢🚀✨
JavaScript
97,973
star
2

node-v0.x-archive

Moved to https://github.com/nodejs/node
34,533
star
3

node-gyp

Node.js native addon build tool
Python
9,275
star
4

docker-node

Official Docker Image for Node.js 🐳 🐢 🚀
Dockerfile
7,872
star
5

http-parser

http request/response parser for c
C
6,223
star
6

nodejs.org

The Node.js® Website
TypeScript
5,819
star
7

undici

An HTTP/1.1 client, written from scratch for Node.js
JavaScript
5,782
star
8

Release

Node.js Release Working Group
3,803
star
9

nan

Native Abstractions for Node.js
C++
3,245
star
10

node-addon-examples

Node.js C++ addon examples from http://nodejs.org/docs/latest/api/addons.html
C++
2,332
star
11

nodejs.dev

A redesign of Nodejs.org built using Gatsby.js with React.js, TypeScript, and Remark.
TypeScript
2,297
star
12

corepack

Zero-runtime-dependency package acting as bridge between Node projects and their package managers
TypeScript
2,150
star
13

node-addon-api

Module for using Node-API from C++
C++
1,999
star
14

node-chakracore

Node.js on ChakraCore ✨🐢🚀✨
JavaScript
1,919
star
15

node-convergence-archive

Archive for node/io.js convergence work pre-3.0.0
JavaScript
1,837
star
16

llhttp

Port of http_parser to llparse
TypeScript
1,552
star
17

help

✨ Need help with Node.js? File an Issue here. 🚀
1,427
star
18

llnode

An lldb plugin for Node.js and V8, which enables inspection of JavaScript states for insights into Node.js processes and their core dumps.
C++
1,140
star
19

readable-stream

Node-core streams for userland
JavaScript
1,003
star
20

examples

A repository of runnable Node.js examples that go beyond "hello, world!"
JavaScript
589
star
21

mentorship

Node.js Mentorship Program Initiative
587
star
22

llparse

Generating parsers in LLVM IR
TypeScript
567
star
23

TSC

The Node.js Technical Steering Committee
JavaScript
557
star
24

citgm

Canary in the Gold Mine
JavaScript
539
star
25

http2

Working on an HTTP/2 implementation for Node.js Core
JavaScript
520
star
26

diagnostics

Node.js Diagnostics Working Group
513
star
27

security-wg

Node.js Ecosystem Security Working Group
JavaScript
482
star
28

build

Better build and test infra for Node.
Shell
469
star
29

next-10

Repository for discussion on strategic directions for next 10 years of Node.js
462
star
30

node-eps

Node.js Enhancement Proposals for discussion on future API additions/changes to Node core
446
star
31

education

A place to discover and contribute to education initiatives in Node.js
417
star
32

modules

Node.js Modules Team
413
star
33

package-maintenance

Repository for work for discussion of helping with maintenance of key packages in the ecosystem.
403
star
34

nodejs-zh-CN

node.js 中文化 & 中文社区
SCSS
395
star
35

node-v8

Experimental Node.js mirror on V8 lkgr ✨🐢🚀✨
Shell
392
star
36

performance

Node.js team focusing on performance
363
star
37

node-inspect

Code that's now part of node, previously `node debug` for `node --inspect`
JavaScript
339
star
38

node-report

Delivers a human-readable diagnostic summary, written to file.
C++
327
star
39

quic

This repository is no longer active.
JavaScript
298
star
40

nodejs-ko

node.js 한국 커뮤니티
Stylus
262
star
41

single-executable

This team aims to advance the state of the art in packaging Node.js applications as single standalone executables (SEAs) on all supported operating systems.
260
star
42

community-committee

The Node.js Community Committee (aka CommComm)
259
star
43

github-bot

@nodejs-github-bot's heart and soul
JavaScript
259
star
44

evangelism

Letting the world know how awesome Node.js is and how to get involved!
242
star
45

abi-stable-node

Repository used by the Node-API team to manage work related to Node-API and node-addon-api
JavaScript
239
star
46

abi-stable-node-addon-examples

Node Add-on Examples with PoC ABI stable API for native modules
C++
238
star
47

node-core-utils

CLI tools for Node.js Core collaborators
JavaScript
228
star
48

changelog-maker

A git log to CHANGELOG.md tool
JavaScript
225
star
49

iojs.org

JavaScript
219
star
50

uvwasi

WASI syscall API built atop libuv
C
217
star
51

unofficial-builds

Unofficial binaries for Node.js
Shell
215
star
52

installer

Electron based installer for Node.js.
JavaScript
194
star
53

getting-started

Getting started in Node.js!
187
star
54

web-server-frameworks

A place for Node.js Web-Server Framework authors and users to collaborate
179
star
55

repl

REPL rewrite for Node.js ✨🐢🚀✨
JavaScript
170
star
56

snap

Node.js snap source and updater
Shell
166
star
57

tooling

Advancing Node.js as a framework for writing great tools
164
star
58

code-and-learn

A series of workshop sprints for Node.js.
Dockerfile
163
star
59

benchmarking

Node.js Benchmarking Working Group
Shell
161
star
60

docker-iojs

Official Docker images from the io.js project
Shell
156
star
61

i18n

The Node.js Internationalization Working Group – A Community Committee initiative.
150
star
62

full-icu-npm

convenience loader for 'small-icu' node builds
JavaScript
150
star
63

postject

Easily inject arbitrary read-only resources into executable formats (Mach-O, PE, ELF) and use it at runtime.
JavaScript
148
star
64

admin

Administrative space for policies of the TSC
146
star
65

roadmap

This repository and working group has been retired.
135
star
66

gyp-next

A fork of the GYP build system for use in the Node.js projects
Python
120
star
67

nodejs-pt

Internacionalização & tradução para português referente ao site nodejs.org
108
star
68

dev-policy

node-foundation dev policy **draft**
108
star
69

promises

Promises Working Group Repository
107
star
70

loaders

ECMAScript Modules Loaders
107
star
71

nodejs-zh-TW

Node.js zh-TW
CSS
107
star
72

NG

Next Generation JavaScript IO Platform
103
star
73

nodejs-ja

Node.js 日本語ローカリゼーション
101
star
74

nodejs.org-archive

[DEPRECATED] Website repository for the Node.js project
Nginx
101
star
75

website-redesign

Facilitating the redesign of the nodejs.org website
99
star
76

node-core-test

Node 18's node:test, as an npm package
JavaScript
90
star
77

worker

Figuring out native (Web?)Worker support for Node
JavaScript
87
star
78

post-mortem

This WG is in the process of being folded into the Diagnostics WG.
85
star
79

inclusivity

Improving inclusivity in the node community
80
star
80

CTC

Node.js Core Technical Committee & Collaborators
80
star
81

nodejs-ru

Перевод io.js на русский язык
JavaScript
79
star
82

ecmascript-modules

A fork of Node.js to hash out ideas related to ESModules
JavaScript
72
star
83

docs

A place for documentation. (this repository is inactive)
71
star
84

webcrypto

This repository has been archived. The WebCrypto API has been implemented in recent versions of Node.js and does not require additional packages.
JavaScript
68
star
85

automation

Better automation for the Node.js project
66
star
86

api

API WG
61
star
87

email

MX server management for iojs.org (and eventually nodejs.org)
JavaScript
59
star
88

user-feedback

Node.js User Feedback Initiative
56
star
89

board

The Node Foundation Board of Directors
JavaScript
52
star
90

logos

Logo ideas
51
star
91

promise-use-cases

Short lived repository in order to discuss Node.js promise use cases in Collaborator Summit Berlin 2018
JavaScript
50
star
92

branch-diff

A tool to list print the commits on one git branch that are not on another using loose comparison
JavaScript
49
star
93

loaders-test

Examples demonstrating the Node.js ECMAScript Modules Loaders API
JavaScript
48
star
94

core-validate-commit

Validate commit messages for Node.js core
JavaScript
48
star
95

open-standards

Node.js Open Standards Team
43
star
96

version-management

Discussion Group for Version Management
42
star
97

hardware

Hardware Working Group
42
star
98

security-advisories

Security advisories for Node.js and the JavaScript ecosystem.
JavaScript
41
star
99

whatwg-stream

WIP Official support for WHATWG Stream in Node.js
37
star
100

vm

Repository for Discussion / Working on Multi-VM Related Issues and Ideas for Node.js
35
star