Worker Farm

Distribute processing tasks to child processes with an über-simple API and baked-in durability & custom concurrency options. Available in npm as worker-farm.

Example

Given a file, child.js:

module.exports = function (inp, callback) {
  callback(null, inp + ' BAR (' + process.pid + ')')
}

And a main file:

var workerFarm = require('worker-farm')
  , workers    = workerFarm(require.resolve('./child'))
  , ret        = 0

for (var i = 0; i < 10; i++) {
  workers('#' + i + ' FOO', function (err, outp) {
    console.log(outp)
    if (++ret == 10)
      workerFarm.end(workers)
  })
}

We'll get an output something like the following:

#1 FOO BAR (8546)
#0 FOO BAR (8545)
#8 FOO BAR (8545)
#9 FOO BAR (8546)
#2 FOO BAR (8548)
#4 FOO BAR (8551)
#3 FOO BAR (8549)
#6 FOO BAR (8555)
#5 FOO BAR (8553)
#7 FOO BAR (8557)

This example is contained in the examples/basic directory.
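
As the sample output shows, results arrive in whatever order the children finish, not in call order. If the caller needs them in input order, one option is to record each result by its index. The helper below is only a sketch and is not part of Worker Farm: collectInOrder is a hypothetical name, and the code assumes nothing beyond workers being a function that takes an input and a Node-style callback, as in the basic example.

```javascript
// Hypothetical helper (not part of worker-farm): call `workers` once per
// input and hand back the results in input order, however they arrive.
function collectInOrder (workers, inputs, done) {
  const results = new Array(inputs.length)
  let remaining = inputs.length
  let failed = false

  if (remaining === 0) return done(null, results)

  inputs.forEach(function (input, index) {
    workers(input, function (err, output) {
      if (failed) return            // an error was already reported
      if (err) {
        failed = true
        return done(err)
      }
      results[index] = output       // slot by index, not by arrival time
      if (--remaining === 0) done(null, results)
    })
  })
}
```

With the farm from the basic example, this might be called as collectInOrder(workers, ['#0 FOO', '#1 FOO'], callback), with workerFarm.end(workers) invoked inside the callback.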

Example #1: Estimating π using child workers

You will also find a more complex example in examples/pi that estimates the value of π by using a Monte Carlo area-under-the-curve method and compares the speed of doing it all in-process vs using child workers to complete separate portions.

Running node examples/pi will give you something like:

Doing it the slow (single-process) way...
π ≈ 3.1416269360000006  (0.0000342824102075312 away from actual!)
took 8341 milliseconds
Doing it the fast (multi-process) way...
π ≈ 3.1416233600000036  (0.00003070641021052367 away from actual!)
took 1985 milliseconds
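
For readers unfamiliar with the method, the following is a sketch of what a Monte Carlo child module of this kind could look like. It is not necessarily the code in examples/pi; estimatePi and average are hypothetical names. Each call samples random points in the unit square and counts how many land inside the quarter circle of radius 1, whose area is π/4.

```javascript
// Hypothetical child module: estimate π from `points` random samples.
// A point (x, y) in the unit square falls inside the quarter circle
// when x*x + y*y <= 1, which happens with probability π / 4.
function estimatePi (points, callback) {
  let inside = 0
  for (let i = 0; i < points; i++) {
    const x = Math.random()
    const y = Math.random()
    if (x * x + y * y <= 1) inside++
  }
  callback(null, (inside / points) * 4)
}

// Combine equally sized partial estimates returned by several calls.
function average (estimates) {
  const sum = estimates.reduce(function (a, b) { return a + b }, 0)
  return sum / estimates.length
}

module.exports = estimatePi
```

A parent process would split the total number of points into equal chunks, send each chunk to the farm, and average the returned estimates once every callback has fired.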

Durability

An important feature of Worker Farm is call durability. If a child process dies for any reason during the execution of call(s), those calls will be re-queued and taken care of by other child processes. In this way, when you ask for something to be done, unless there is something seriously wrong with what you're doing, you should get a result on your callback function.

My use-case

There are other libraries for managing worker processes available but my use-case was fairly specific: I need to make heavy use of the node-java library to interact with JVM code. Unfortunately, because the JVM garbage collector is so difficult to interact with, it's prone to killing your Node process when the GC kicks in under heavy load. For safety I needed a durable way to make calls so that (1) it wouldn't kill my main process and (2) any calls that weren't successful would be resubmitted for processing.

Worker Farm allows me to spin up multiple JVMs to be controlled by Node and gives me a single, uncomplicated API that acts the same way as an in-process API. The calls will be taken care of by a child process even if an error kills that child while it is working, as the call will simply be passed to a new child process.

But don't think that Worker Farm is specific to that use-case; it's designed to be very generic and simple to adapt to anything requiring the use of child Node processes.

API

Worker Farm exports a main function and an end() method. The main function sets up a "farm" of coordinated child-process workers and it can be used to instantiate multiple farms, all operating independently.

workerFarm([options, ]pathToModule[, exportedMethods])

In its most basic form, you call workerFarm() with the path to a module file to be invoked by the child process. You should use an absolute path to the module file. The best way to obtain the path is with require.resolve('./path/to/module'), which can be used in exactly the same way as require('./path/to/module') but returns an absolute path instead of loading the module.

exportedMethods

If your module exports a single function on module.exports then you should omit the final parameter. However, if you are exporting multiple functions on module.exports then you should list them in an Array of Strings:

var workers = workerFarm(require.resolve('./mod'), [ 'doSomething', 'doSomethingElse' ])
workers.doSomething(function () {})
workers.doSomethingElse(function () {})

Listing the available methods tells Worker Farm what API to provide on the returned object. If you don't list an exportedMethods Array then you'll get a single callable function to use; but if you list the available methods then you'll get an object with callable functions by those names.

It is assumed that each function you call on your child module will take a callback function as the last argument.
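
A child module matching the call above could look like the following sketch. The function bodies are hypothetical; the only requirement taken from the text is that each exported function accepts a callback as its last argument.

```javascript
// mod.js (hypothetical contents): two exported functions, each taking a
// Node-style callback as the last argument.
function doSomething (callback) {
  callback(null, 'something done by ' + process.pid)
}

function doSomethingElse (callback) {
  callback(null, 'something else done by ' + process.pid)
}

// The keys here are the names listed in the exportedMethods Array.
module.exports = { doSomething: doSomething, doSomethingElse: doSomethingElse }
```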

options

If you don't provide an options object then the following defaults will be used:

{
    workerOptions               : {}
  , maxCallsPerWorker           : Infinity
  , maxConcurrentWorkers        : require('os').cpus().length
  , maxConcurrentCallsPerWorker : 10
  , maxConcurrentCalls          : Infinity
  , maxCallTime                 : Infinity
  , maxRetries                  : Infinity
  , autoStart                   : false
  , onChild                     : function() {}
}
  • workerOptions allows you to customize all the parameters passed to child nodes. This object supports all possible options of child_process.fork. The default options passed are the parent execArgv, cwd and env. Any (or all) of them can be overridden, and others can be added as well.

  • maxCallsPerWorker allows you to control the lifespan of your child processes. A positive number will indicate that you only want each child to accept that many calls before it is terminated. This may be useful if you need to control memory leaks or similar in child processes.

  • maxConcurrentWorkers will set the number of child processes to maintain concurrently. By default it is set to the number of CPUs available on the current system, but it can be any reasonable number, including 1.

  • maxConcurrentCallsPerWorker allows you to control the concurrency of individual child processes. Calls are placed into a queue and farmed out to child processes according to the number of calls they are allowed to handle concurrently. It is arbitrarily set to 10 by default so that calls are shared relatively evenly across workers. However, if your calls predictably take a similar amount of time then you could set it to Infinity and Worker Farm won't queue any calls but will spread them evenly across child processes and let them go at it. If your calls aren't I/O bound then it won't matter what value you use here, as the individual workers won't be able to execute more than a single call at a time.

  • maxConcurrentCalls allows you to control the maximum number of calls in the queue, whether actively being processed or waiting to be processed by a worker. Infinity indicates no limit, but if you have conditions that may endlessly queue jobs and you need to set a limit then provide a value greater than 0, and any calls that push past the limit will return on their callback with a MaxConcurrentCallsError error (check err.type == 'MaxConcurrentCallsError').

  • maxCallTime (use with caution, understand what this does before you use it!) when !== Infinity, will cap the time, in milliseconds, that any single call can take to execute in a worker. If this time limit is exceeded by just a single call then the worker running that call will be killed and any calls running on that worker will have their callbacks returned with a TimeoutError (check err.type == 'TimeoutError'). If you are running with a maxConcurrentCallsPerWorker value greater than 1 then all calls currently executing will fail and will be automatically resubmitted unless you've changed the maxRetries option. Use this if you have jobs that may potentially end in infinite loops that you can't programmatically end with your child code. Preferably run this with a maxConcurrentCallsPerWorker of 1 so you don't interrupt other calls when you have a timeout. This timeout operates on a per-call basis but will interrupt a whole worker.

  • maxRetries allows you to control the maximum number of times a call is requeued after worker termination (unexpected or timeout). By default this option is set to Infinity, which means that each call of each terminated worker will always be automatically requeued. When the number of retries exceeds the maxRetries value, the job callback will be executed with a ProcessTerminatedError. Note that if you are running with a finite maxCallTime and a maxConcurrentCallsPerWorker greater than 1 then any TimeoutError will increase the retries counter for each concurrent call of the terminated worker.

  • autoStart when set to true will start the workers as early as possible. Use this when your workers have to do expensive initialization. That way they'll be ready when the first request comes through.

  • onChild is a callback that will be called with the subprocess object as its argument each time a new child process starts. Use this when you need to add some custom communication with child processes.
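
To make the options more concrete, here is a sketch of a configuration for CPU-bound jobs that might hang, plus a small helper that classifies the errors described above. The specific values and the names cpuBoundOptions and describeFarmError are illustrative assumptions. The text documents err.type for MaxConcurrentCallsError and TimeoutError; checking err.type for ProcessTerminatedError is assumed by analogy.

```javascript
const os = require('os')

// Illustrative values only; tune them for your own workload.
const cpuBoundOptions = {
  maxConcurrentWorkers: os.cpus().length, // same as the default
  maxConcurrentCallsPerWorker: 1,         // a timeout then affects one call only
  maxConcurrentCalls: 1000,               // cap on queued plus in-flight calls
  maxCallTime: 5000,                      // kill a worker stuck for 5 s on a call
  maxRetries: 3,                          // stop requeueing once this is exceeded
  autoStart: true                         // start workers before the first call
}

// Map an error received in a farm callback to a short explanation.
function describeFarmError (err) {
  if (!err) return 'ok'
  switch (err.type) {
    case 'MaxConcurrentCallsError':
      return 'rejected: the farm already holds maxConcurrentCalls calls'
    case 'TimeoutError':
      return 'timed out: the call exceeded maxCallTime'
    case 'ProcessTerminatedError': // assumption: exposed on err.type as well
      return 'gave up: the worker died and maxRetries was exceeded'
    default:
      return 'other error: ' + err.message
  }
}
```

Following the signature given earlier, the options object goes first: workerFarm(cpuBoundOptions, require.resolve('./child')).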

workerFarm.end(farm)

Child processes stay alive waiting for jobs indefinitely and your farm manager will stay alive managing its workers, so if you need it to stop then you have to do so explicitly. If you send your farm API to workerFarm.end() then it'll cleanly end your worker processes. Note though that it's a soft ending so it'll wait for child processes to finish what they are working on before asking them to die.

Any calls that are queued and not yet being handled by a child process will be discarded. end() only waits for those currently in progress.

Once you end a farm, it won't handle any more calls, so don't even try!
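
Because end() discards calls that are still queued, a caller that wants every submitted call to finish has to wait for the outstanding callbacks before ending the farm. The wrapper below is one way to sketch that. trackFarm, call and endWhenIdle are hypothetical names, and the code only assumes the single-function farm form and workerFarm.end(farm) as documented above.

```javascript
// Hypothetical wrapper (not part of worker-farm): count in-flight calls and
// only end the farm once every callback has fired.
function trackFarm (workerFarm, workers) {
  let pending = 0
  let ending = false
  let ended = false

  function maybeEnd () {
    if (ending && !ended && pending === 0) {
      ended = true
      workerFarm.end(workers)
    }
  }

  // Same shape as calling `workers` directly: arguments, then a callback.
  function call () {
    const args = Array.prototype.slice.call(arguments)
    const callback = args.pop()
    if (ending) return callback(new Error('farm is ending, call refused'))
    pending++
    args.push(function () {
      pending--
      callback.apply(null, arguments)
      maybeEnd()
    })
    workers.apply(null, args)
  }

  // Refuse new calls from now on and end the farm when nothing is in flight.
  function endWhenIdle () {
    ending = true
    maybeEnd()
  }

  return { call: call, endWhenIdle: endWhenIdle }
}
```

The design choice here is to refuse new calls once endWhenIdle() has been requested, which matches the rule that an ended farm won't handle any more calls.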

Related

  • farm-cli – Launch a farm of workers from CLI.

License

Worker Farm is Copyright (c) Rod Vagg and licensed under the MIT license. All rights not explicitly granted in the MIT license are reserved. See the included LICENSE.md file for more details.
