• Stars
    star
    381
  • Rank 112,502 (Top 3 %)
  • Language
    JavaScript
  • License
    MIT License
  • Created almost 11 years ago
  • Updated over 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

tar-stream is a streaming tar parser and generator.

tar-stream

tar-stream is a streaming tar parser and generator and nothing else. It operates purely using streams which means you can easily extract/parse tarballs without ever hitting the file system.

Note that you still need to gunzip your data if you have a .tar.gz. We recommend using gunzip-maybe in conjunction with this.

npm install tar-stream

build status License

Usage

tar-stream exposes two streams, pack which creates tarballs and extract which extracts tarballs. To modify an existing tarball use both.

It implementes USTAR with additional support for pax extended headers. It should be compatible with all popular tar distributions out there (gnutar, bsdtar etc)

Related

If you want to pack/unpack directories on the file system check out tar-fs which provides file system bindings to this module.

Packing

To create a pack stream use tar.pack() and call pack.entry(header, [callback]) to add tar entries.

const tar = require('tar-stream')
const pack = tar.pack() // pack is a stream

// add a file called my-test.txt with the content "Hello World!"
pack.entry({ name: 'my-test.txt' }, 'Hello World!')

// add a file called my-stream-test.txt from a stream
const entry = pack.entry({ name: 'my-stream-test.txt', size: 11 }, function(err) {
  // the stream was added
  // no more entries
  pack.finalize()
})

entry.write('hello')
entry.write(' ')
entry.write('world')
entry.end()

// pipe the pack stream somewhere
pack.pipe(process.stdout)

Extracting

To extract a stream use tar.extract() and listen for extract.on('entry', (header, stream, next) )

const extract = tar.extract()

extract.on('entry', function (header, stream, next) {
  // header is the tar header
  // stream is the content body (might be an empty stream)
  // call next when you are done with this entry

  stream.on('end', function () {
    next() // ready for next entry
  })

  stream.resume() // just auto drain the stream
})

extract.on('finish', function () {
  // all entries read
})

pack.pipe(extract)

The tar archive is streamed sequentially, meaning you must drain each entry's stream as you get them or else the main extract stream will receive backpressure and stop reading.

Extracting as an async iterator

The extraction stream in addition to being a writable stream is also an async iterator

const extract = tar.extract()

someStream.pipe(extract)

for await (const entry of extract) {
  entry.header // the tar header
  entry.resume() // the entry is the stream also
}

Headers

The header object using in entry should contain the following properties. Most of these values can be found by stat'ing a file.

{
  name: 'path/to/this/entry.txt',
  size: 1314,        // entry size. defaults to 0
  mode: 0o644,       // entry mode. defaults to to 0o755 for dirs and 0o644 otherwise
  mtime: new Date(), // last modified date for entry. defaults to now.
  type: 'file',      // type of entry. defaults to file. can be:
                     // file | link | symlink | directory | block-device
                     // character-device | fifo | contiguous-file
  linkname: 'path',  // linked file name
  uid: 0,            // uid of entry owner. defaults to 0
  gid: 0,            // gid of entry owner. defaults to 0
  uname: 'maf',      // uname of entry owner. defaults to null
  gname: 'staff',    // gname of entry owner. defaults to null
  devmajor: 0,       // device major version. defaults to 0
  devminor: 0        // device minor version. defaults to 0
}

Modifying existing tarballs

Using tar-stream it is easy to rewrite paths / change modes etc in an existing tarball.

const extract = tar.extract()
const pack = tar.pack()
const path = require('path')

extract.on('entry', function (header, stream, callback) {
  // let's prefix all names with 'tmp'
  header.name = path.join('tmp', header.name)
  // write the new entry to the pack stream
  stream.pipe(pack.entry(header, callback))
})

extract.on('finish', function () {
  // all entries done - lets finalize it
  pack.finalize()
})

// pipe the old tarball to the extractor
oldTarballStream.pipe(extract)

// pipe the new tarball the another stream
pack.pipe(newTarballStream)

Saving tarball to fs

const fs = require('fs')
const tar = require('tar-stream')

const pack = tar.pack() // pack is a stream
const path = 'YourTarBall.tar'
const yourTarball = fs.createWriteStream(path)

// add a file called YourFile.txt with the content "Hello World!"
pack.entry({ name: 'YourFile.txt' }, 'Hello World!', function (err) {
  if (err) throw err
  pack.finalize()
})

// pipe the pack stream to your file
pack.pipe(yourTarball)

yourTarball.on('close', function () {
  console.log(path + ' has been written')
  fs.stat(path, function(err, stats) {
    if (err) throw err
    console.log(stats)
    console.log('Got file info successfully!')
  })
})

Performance

See tar-fs for a performance comparison with node-tar

License

MIT

More Repositories

1

peerflix

Streaming torrent client for node.js
JavaScript
6,094
star
2

playback

Video player built using electron and node.js
JavaScript
2,009
star
3

torrent-stream

The low level streaming torrent engine that peerflix uses
JavaScript
1,941
star
4

why-is-node-running

Node is running but you don't know why? why-is-node-running is here to help you.
JavaScript
1,781
star
5

chromecasts

Query your local network for Chromecasts and have them play media
JavaScript
1,447
star
6

csv-parser

Streaming csv parser inspired by binary-csv that aims to be faster than everyone else
JavaScript
1,413
star
7

torrent-mount

Mount a torrent (or magnet link) as a filesystem in real time using torrent-stream and fuse. AKA MAD SCIENCE!
JavaScript
1,333
star
8

turbo-http

Blazing fast low level http server
JavaScript
996
star
9

is-my-json-valid

A JSONSchema validator that uses code generation to be extremely fast
JavaScript
961
star
10

pump

pipe streams together and close all of them if one of them closes
JavaScript
895
star
11

airpaste

A 1-1 network pipe that auto discovers other peers using mdns
JavaScript
819
star
12

hyperdb

Distributed scalable database
JavaScript
752
star
13

protocol-buffers

Protocol Buffers for Node.js
JavaScript
751
star
14

signalhub

Simple signalling server that can be used to coordinate handshaking with webrtc or other fun stuff.
JavaScript
667
star
15

turbo-json-parse

Turbocharged JSON.parse for type stable JSON data
JavaScript
613
star
16

turbo-net

Low level TCP library for Node.js
JavaScript
598
star
17

peercast

torrent-stream + chromecast
JavaScript
509
star
18

hyperbeam

A 1-1 end-to-end encrypted internet pipe powered by Hyperswarm
JavaScript
482
star
19

multicast-dns

Low level multicast-dns implementation in pure javascript
JavaScript
470
star
20

hyperlog

Merkle DAG that replicates based on scuttlebutt logs and causal linking
JavaScript
466
star
21

hypervision

P2P Television
JavaScript
445
star
22

webcat

Mad science p2p pipe across the web using webrtc that uses your Github private/public key for authentication and a signalhub for discovery
JavaScript
437
star
23

webrtc-swarm

Create a swarm of p2p connections using webrtc and a signalhub
JavaScript
375
star
24

discovery-swarm

A network swarm that uses discovery-channel to find peers
JavaScript
375
star
25

tar-fs

fs bindings for tar-stream
JavaScript
339
star
26

torrent-docker

MAD SCIENCE realtime boot of remote docker images using bittorrent
JavaScript
314
star
27

fuse-bindings

Notice: We published the successor module to this here https://github.com/fuse-friends/fuse-native
C++
312
star
28

peerwiki

all of wikipedia on bittorrent
JavaScript
308
star
29

awesome-p2p

List of great p2p resources
301
star
30

hyperfs

A content-addressable union file system build on top of fuse, hyperlog, leveldb and node
JavaScript
270
star
31

respawn

Spawn a process and restart it if it crashes
JavaScript
254
star
32

pumpify

Combine an array of streams into a single duplex stream using pump and duplexify
JavaScript
252
star
33

polo

Polo is a zero configuration service discovery module written completely in Javascript.
JavaScript
246
star
34

benny-hill

Play the Benny Hill theme while running another command
JavaScript
242
star
35

streamx

An iteration of the Node.js core streams with a series of improvements.
JavaScript
224
star
36

mp4-stream

Streaming mp4 encoder and decoder
JavaScript
219
star
37

hyperphone

A telephone over Hyperbeam
JavaScript
198
star
38

flat-file-db

Fast in-process flat file database that caches all data in memory
JavaScript
196
star
39

diffy

A tiny framework for building diff based interactive command line tools.
JavaScript
191
star
40

dns-discovery

Discovery peers in a distributed system using regular dns and multicast dns.
JavaScript
190
star
41

duplexify

Turn a writable and readable stream into a streams2 duplex stream with support for async initialization and streams1/streams2 input
JavaScript
185
star
42

browser-server

A HTTP "server" in the browser that uses a service worker to allow you to easily send back your own stream of data.
JavaScript
185
star
43

browserify-fs

fs for the browser using level-filesystem and browserify
JavaScript
184
star
44

ims

Install My Stuff - an opinionated npm module installer
JavaScript
184
star
45

dns-packet

An abstract-encoding compliant module for encoding / decoding DNS packets
JavaScript
181
star
46

jitson

Just-In-Time JSON.parse compiler
JavaScript
178
star
47

dnsjack

A simple DNS proxy that lets you intercept domains and route them to whatever IP you decide.
JavaScript
172
star
48

nanobench

Simple benchmarking tool with TAP-like output that is easy to parse
JavaScript
169
star
49

localcast

A shared event emitter that works across multiple processes on the same machine, including the browser!
JavaScript
165
star
50

level-filesystem

Full implementation of the fs module on top of leveldb
JavaScript
164
star
51

dht-rpc

Make RPC calls over a Kademlia based DHT.
JavaScript
160
star
52

tetris

Play tetris in your terminal - in color
JavaScript
157
star
53

hyperssh

Run SSH over hyperswarm!
JavaScript
146
star
54

end-of-stream

Call a callback when a readable/writable/duplex stream has completed or failed.
JavaScript
145
star
55

flat-tree

A series of functions to map a binary tree to a list
JavaScript
141
star
56

lil-pids

Dead simple process manager with few features
JavaScript
140
star
57

airswarm

Network swarm that automagically discovers other peers on the network using multicast dns
JavaScript
127
star
58

wat2js

Compile WebAssembly .wat files to a common js module
JavaScript
127
star
59

node-modules

Search for node modules
JavaScript
127
star
60

ssh-exec

Execute a script over ssh using Node.JS
JavaScript
126
star
61

add-to-systemd

Small command line tool to simply add a service to systemd
JavaScript
125
star
62

deejay

Music player that broadcasts to everyone on the same network
JavaScript
124
star
63

protocol-buffers-schema

No nonsense protocol buffers schema parser written in Javascript
JavaScript
120
star
64

tree-to-string

Convert a tree structure into a human friendly string
JavaScript
120
star
65

unordered-array-remove

Efficiently remove an element from an unordered array without doing a splice
JavaScript
117
star
66

hyperpipe

Distributed input/output pipe.
JavaScript
116
star
67

abstract-chunk-store

A test suite and interface you can use to implement a chunk based storage backend
JavaScript
113
star
68

shared-structs

Share a struct backed by the same underlying buffer between C and JavaScript
JavaScript
113
star
69

mininet

Spin up and interact with virtual networks using Mininet and Node.js
JavaScript
113
star
70

p2p-workshop

a workshop to learn about p2p
HTML
111
star
71

jsonkv

Single file write-once database that is valid JSON with efficient random access on bigger datasets
JavaScript
109
star
72

ansi-diff-stream

A transform stream that diffs input buffers and outputs the diff as ANSI. If you pipe this to a terminal it will update the output with minimal changes
JavaScript
109
star
73

browser-sync-stream

Rsync between a server and the browser.
JavaScript
108
star
74

docker-registry-server

docker registry server in node.js
JavaScript
108
star
75

dns-socket

Make custom low-level DNS requests from node with retry support.
JavaScript
102
star
76

utp-native

Native bindings for libutp
JavaScript
100
star
77

taco-nginx

Bash script that runs a service and forwards a subdomain to it using nginx when it listens to $PORT
Shell
100
star
78

gunzip-maybe

Transform stream that gunzips its input if it is gzipped and just echoes it if not
JavaScript
98
star
79

merkle-tree-stream

A stream that generates a merkle tree based on the incoming data.
JavaScript
98
star
80

media-recorder-stream

The Media Recorder API in the browser as a readable stream
JavaScript
97
star
81

thunky

Delay the evaluation of a paramless async function and cache the result
JavaScript
97
star
82

peervision

a live p2p streaming protocol
JavaScript
97
star
83

noise-network

Authenticated P2P network backed by Hyperswarm and Noise
JavaScript
96
star
84

soundcloud-to-dat

Download all music from a Soundcloud url and put it into a Dat
JavaScript
96
star
85

blecat

1-1 pipe over bluetooth low energy
JavaScript
95
star
86

debugment

A debug comment -> debugment
JavaScript
93
star
87

hyperdht

A DHT that supports peer discovery and distributed hole punching
JavaScript
93
star
88

docker-browser-console

Forward input/output from docker containers to your browser
JavaScript
90
star
89

srt-to-vtt

Transform stream that converts srt files to vtt files (html5 video subtitles)
JavaScript
89
star
90

speedometer

speed measurement in javascript
JavaScript
88
star
91

mutexify

Bike shed mutex lock implementation
JavaScript
88
star
92

p2p-file-sharing-workshop

A workshop where you learn about distributed file sharing
HTML
88
star
93

mirror-folder

Small module to mirror a folder to another folder. Supports live mode as well.
JavaScript
87
star
94

utp

utp (micro transport protocol) implementation in node
JavaScript
86
star
95

echo-servers.c

A collection of various echo servers in c
C
83
star
96

recursive-watch

Minimal recursive file watcher
JavaScript
82
star
97

docker-browser-server

Spawn and expose docker containers over http and websockets
JavaScript
80
star
98

are-feross-and-mafintosh-stuck-in-an-elevator

Are @feross and @mafintosh stuck in an elevator?
JavaScript
79
star
99

parallel-transform

Transform stream for Node.js that allows you to run your transforms in parallel without changing the order
JavaScript
79
star
100

peer-wire-swarm

swarm implementation for bittorrent
JavaScript
79
star