Random access unzip library for JavaScript

unzipit.js

Random access unzip library for browser and node based JavaScript


How to use

without workers

import {unzip} from 'unzipit';

async function readFiles(url) {
  const {entries} = await unzip(url);

  // print all entries and their sizes
  for (const [name, entry] of Object.entries(entries)) {
    console.log(name, entry.size);
  }

  // read an entry as an ArrayBuffer
  const arrayBuffer = await entries['path/to/file'].arrayBuffer();

  // read an entry as a blob and tag it with mime type 'image/png'
  const blob = await entries['path/to/otherFile'].blob('image/png');
}

with workers

import {unzip, setOptions} from 'unzipit';

setOptions({workerURL: 'path/to/unzipit-worker.module.js'});

async function readFiles(url) {
  const {entries} = await unzip(url);
  // ...
}

or if you prefer

import * as unzipit from 'unzipit';

unzipit.setOptions({workerURL: 'path/to/unzipit-worker.module.js'});

async function readFiles(url) {
  const {entries} = await unzipit.unzip(url);
  // ...
}

In Parallel

import {unzip, setOptions} from 'unzipit';

setOptions({workerURL: 'path/to/unzipit-worker.module.js'});

async function readFiles(url) {
  const {entries} = await unzip(url);
  const names = Object.keys(entries);
  const blobs = await Promise.all(Object.values(entries).map(e => e.blob()));

  // names and blobs are now parallel arrays so do whatever you want.
  const blobsByName = Object.fromEntries(names.map((name, i) => [name, blobs[i]]));
}

You can also pass a Blob, ArrayBuffer, SharedArrayBuffer, TypedArray, or your own Reader

To use unzipit without a builder/bundler, grab unzipit.min.js or unzipit.module.js from the dist folder and include it with

import * as unzipit from './unzipit.module.js';

or

<script src="unzipit.min.js"></script>

or via a CDN (replace x.y.z with the version you want)

import * as unzipit from 'https://unpkg.com/unzipit@x.y.z/dist/unzipit.module.js';

or

<script src="https://unpkg.com/unzipit@x.y.z/dist/unzipit.js"></script>

Node

For node you need to make your own Reader or pass in an ArrayBuffer, SharedArrayBuffer, or TypedArray.

Load a file as an ArrayBuffer

const unzipit = require('unzipit');
const fsPromises = require('fs').promises;

async function readFiles(filename) {
  const buf = await fsPromises.readFile(filename);
  const {zip, entries} = await unzipit.unzip(new Uint8Array(buf));
  // ... (see code above)
}

You can also pass your own reader. Here are two examples. The first one is stateless, which means there is never anything to clean up. The trade-off is that it opens the source file again each time you get the contents of an entry. I have not measured how much overhead that adds.

const unzipit = require('unzipit');
const fsPromises = require('fs').promises;

class StatelessFileReader {
  constructor(filename) {
    this.filename = filename;
  }
  async getLength() {
    if (this.length === undefined) {
      const stat = await fsPromises.stat(this.filename);
      this.length = stat.size;
    }
    return this.length;
  }
  async read(offset, length) {
    const fh = await fsPromises.open(this.filename);
    try {
      const data = new Uint8Array(length);
      await fh.read(data, 0, length, offset);
      return data;
    } finally {
      // always close the handle, even if the read fails
      await fh.close();
    }
  }
}

async function readFiles(filename) {
  const reader = new StatelessFileReader(filename);
  const {zip, entries} = await unzipit.unzip(reader);
  // ... (see code above)
}

Here's a second example that opens the file only once, which means the file stays open until you manually call close.

class FileReader {
  constructor(filename) {
    this.fhp = fsPromises.open(filename);
  }
  async close() {
    const fh = await this.fhp;
    await fh.close();
  }
  async getLength() {
    if (this.length === undefined) {
      const fh = await this.fhp;
      const stat = await fh.stat();
      this.length = stat.size;
    }
    return this.length;
  }
  async read(offset, length) {
    const fh = await this.fhp;
    const data = new Uint8Array(length);
    await fh.read(data, 0, length, offset);
    return data;
  }
}

async function doStuff(filename) {
  // ...

  const reader = new FileReader(filename);
  const {zip, entries} = await unzipit.unzip(reader);

  // ... do stuff with entries ...

  // you must call reader.close for the file to close
  await reader.close();
}

Workers in Node

const unzipit = require('unzipit');

unzipit.setOptions({workerURL: require.resolve('unzipit/dist/unzipit-worker.js')});

// ...

// You only need to shut down the workers if you need node to exit.
unzipit.cleanup();

Why?

Most of the js libraries I looked at would decompress all files in the zip file. That's probably the most common use case but it didn't fit my needs. I needed to, as fast as possible, open a zip and read a specific file. The better libraries only worked on node, and I needed a browser based solution for Electron.

Note that to reproduce the behavior of most unzip libs you would just do

import {unzip} from 'unzipit';

async function readFiles(url) {
  const {entries} = await unzip(url);
  await Promise.all(Object.values(entries).map(async(entry) => {
    entry.data = await entry.arrayBuffer();
  }));
}

One other thing is that many libraries seem bloated. IMO the smaller the API the better. I don't need a library to try to do 50 things via options and configuration. Rather I need a library to handle the main task and make it possible to do the rest outside the library. This makes a library far more flexible.

As an example, some libraries provide no raw data for filenames. Apparently some zip files have non-UTF-8 filenames in them. This library's solution is to give you the raw bytes so you can decode the names on your own.

Example

const {zip, entries: entriesArray} = await unzipit.unzipRaw(url);
// decode names as big5 (chinese)
const decoder = new TextDecoder('big5');
entriesArray.forEach(entry => {
  entry.name = decoder.decode(entry.nameBytes);
});
const entries = Object.fromEntries(entriesArray.map(v => [v.name, v]));
// ... same as above beyond this point

Same thing with filenames. If you care about slashes or backslashes, handle that yourself outside the library.

const {entries} = await unzipit.unzip(url);
// change slashes and backslashes into '-'
Object.values(entries).forEach(entry => {
  entry.name = entry.name.replace(/\\|\//g, '-');
});

Some libraries both zip and unzip. IMO those should be separate libraries as there is little if any code to share between both. Plenty of projects only need to do one or the other.

Similarly inflate and deflate libraries should be separate from zip, unzip libraries. You need one or the other not both. See zlib as an example.

This library is ES6 based using async/await and import which makes the code much simpler.

Advantages over other libraries.

  • JSZIP requires the entire compressed file in memory. It also requires reading through all entries in order.

  • UZIP requires the entire compressed file to be in memory and the entire uncompressed contents of all the files to be in memory.

  • Yauzl does not require all the files to be in memory but they do have to be read in order, and it has a very peculiar API where you still have to manually go through all the entries even if you don't choose to read their contents. Further, it's node only.

  • fflate has 2 modes. In one, the entire contents of all uncompressed files are provided, which uses lots of memory. The other is like Yauzl, where you're required to handle every file but can choose to ignore certain ones. Further, this mode (maybe both modes) is not standards compliant. It scans for files, but that is not a valid way to read a zip file. The only valid way to read a zip file is to jump to the end of the file and find the table of contents. So, fflate will fail on perfectly valid zip files.

Unzipit does not require all compressed content nor all uncompressed content to be in memory. Only the entries you access use memory. If you use a Blob as input the browser can effectively virtualize access so it doesn't have to be in memory and unzipit will only access the parts of the blob needed to read the content you request.

Further, if you use the HTTPRangeReader or similar, unzipit only downloads/reads the parts of the zip file you actually use, saving you bandwidth.

As well, if you only need the data for images or video or audio then you can do things like

const {entries} = await unzip(url);
const blob = await entries['/some/image.jpg'].blob('image/jpeg');
// use a different name so the url passed to unzip is not redeclared
const imgURL = URL.createObjectURL(blob);
const img = new Image();
img.src = imgURL;

Notice that with Blobs your code never touches the data directly, which means the browser manages it. Blobs don't count as part of the JavaScript heap.

In node, the examples with the file readers will only read the header and whatever entries' contents you ask for so similarly you can avoid having everything in memory except the things you read.

API

import { unzip, unzipRaw, setOptions, cleanup } from 'unzipit';

unzip, unzipRaw

async unzip(url: string): ZipInfo
async unzip(src: Blob): ZipInfo
async unzip(src: TypedArray): ZipInfo
async unzip(src: ArrayBuffer): ZipInfo
async unzip(src: Reader): ZipInfo

async unzipRaw(url: string): ZipInfoRaw
async unzipRaw(src: Blob): ZipInfoRaw
async unzipRaw(src: TypedArray): ZipInfoRaw
async unzipRaw(src: ArrayBuffer): ZipInfoRaw
async unzipRaw(src: Reader): ZipInfoRaw

unzip and unzipRaw are async functions that take a URL, Blob, TypedArray, ArrayBuffer, or Reader. Both return an object with the fields zip and entries. The difference is that with unzip, entries is an object mapping filenames to ZipEntry objects, whereas with unzipRaw it is an array of ZipEntry objects. The reason to use unzipRaw over unzip is that if the filenames are not UTF-8 the library can't build a correct object from the names. In that case you get an array of entries, read entry.nameBytes, and decode the names as you please.

type ZipInfo = {
  zip: Zip,
  entries: {[key: string]: ZipEntry},
};
type ZipInfoRaw = {
  zip: Zip,
  entries: ZipEntry[],
};
class Zip {
  comment: string,           // the comment for the zip file
  commentBytes: Uint8Array,  // the raw data for comment, see nameBytes
}
class ZipEntry {
  async blob(type?: string): Blob,  // returns a Blob for this entry
                                    //  (optional type as in 'image/jpeg')
  async arrayBuffer(): ArrayBuffer, // returns an ArrayBuffer for this entry
  async text(): string,             // returns text, assumes the text is valid utf8.
                                    // If you want more options decode arrayBuffer yourself
  async json(): any,                // returns text with JSON.parse called on it.
                                    // If you want more options decode arrayBuffer yourself
  name: string,                     // name of entry
  nameBytes: Uint8Array,            // raw name of entry (see notes)
  size: number,                     // size in bytes
  compressedSize: number,           // size before decompressing
  comment: string,                  // the comment for this entry
  commentBytes: Uint8Array,         // the raw comment for this entry
  lastModDate: Date,                // a Date
  isDirectory: boolean,             // True if directory
  encrypted: boolean,               // True if encrypted
  externalFileAttributes: number,   // platform specific file attributes
  versionMadeBy: number,            // platform that made this file
}
interface Reader {
  async getLength(): number,
  async read(offset, size): Uint8Array,
}
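
As a rough sketch of the Reader interface, here is a minimal in-memory reader. The class name LoggingArrayReader and its reads log are hypothetical and not part of unzipit. The only things taken from the interface above are getLength() and read(offset, size). Something like this can be handy for seeing which byte ranges get requested.

```javascript
// Hypothetical example reader. Only getLength() and read() come from the
// Reader interface described above.
class LoggingArrayReader {
  constructor(bytes) {
    this.bytes = bytes;  // a Uint8Array holding the whole zip file
    this.reads = [];     // record of every {offset, size} requested
  }
  async getLength() {
    return this.bytes.length;
  }
  async read(offset, size) {
    this.reads.push({offset, size});
    // slice() copies, so callers cannot modify the underlying data
    return this.bytes.slice(offset, offset + size);
  }
}
```

Assuming bytes is a Uint8Array containing a zip file, you would pass new LoggingArrayReader(bytes) to unzip the same way as the file readers in the Node section.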

setOptions

setOptions(options: UnzipitOptions)

The options are

  • useWorkers: true/false (default: false)

  • workerURL: string

    The URL to use to load the worker script. Note setting this automatically sets useWorkers to true

  • numWorkers: number (default: 1)

    How many workers to use. You can inflate more files in parallel with more workers.

cleanup

cleanup()

Shuts down the workers. You would only need to call this if you want node to exit since it will wait for the workers to exit.

Notes:

Supporting old browsers

Use a transpiler like Babel.

Caching

If you ask for the same entry twice it will be read twice and decompressed twice. If you want to cache entries, implement that at a level above unzipit.
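
One possible way to layer a cache on top is sketched below. makeCachedReader is a hypothetical helper, not part of unzipit. It only assumes that each entry has the async arrayBuffer() method listed in the API section.

```javascript
// Hypothetical helper, not part of unzipit: caches the promise returned by
// entry.arrayBuffer() so each entry is read and decompressed at most once.
function makeCachedReader(entries) {
  const cache = new Map();  // entry name -> Promise<ArrayBuffer>
  return function getArrayBuffer(name) {
    const entry = entries[name];
    if (!entry) {
      return Promise.reject(new Error(`no such entry: ${name}`));
    }
    if (!cache.has(name)) {
      const promise = entry.arrayBuffer();
      // do not keep failed reads in the cache so they can be retried
      promise.catch(() => cache.delete(name));
      cache.set(name, promise);
    }
    return cache.get(name);
  };
}
```

Every caller gets the same ArrayBuffer back, so treat it as read-only. The cache never evicts anything, which is fine for a sketch but probably not for large archives.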

Streaming

You can't stream zip files. The only valid way to read a zip file is to read the central directory which is at the end of the zip file. Sure there are zip files where you can cheat and read the local headers of each file but that is an invalid way to read a zip file and it's trivial to create zip files that will fail when read that way but are perfectly valid zip files.
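
To make the point concrete, here is a simplified sketch of locating the end of central directory record by scanning backwards from the end of the data. This is not unzipit's code. It ignores zip64 and multi-disk archives, and the function and field names are made up for illustration. The signature and field offsets follow the zip format's end of central directory record.

```javascript
// A simplified sketch, not unzipit's implementation.
const EOCD_SIGNATURE = 0x06054b50;
const EOCD_MIN_SIZE = 22;         // size of the record with an empty comment
const MAX_COMMENT_SIZE = 0xffff;  // the comment length is a 16 bit field

function findEndOfCentralDirectory(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const lowest = Math.max(0, bytes.length - EOCD_MIN_SIZE - MAX_COMMENT_SIZE);
  // scan backwards from the last position where the record could start
  for (let pos = bytes.length - EOCD_MIN_SIZE; pos >= lowest; --pos) {
    if (view.getUint32(pos, true) !== EOCD_SIGNATURE) {
      continue;
    }
    const commentLength = view.getUint16(pos + 20, true);
    if (pos + EOCD_MIN_SIZE + commentLength !== bytes.length) {
      continue;  // the signature bytes happened to appear inside the comment
    }
    return {
      entryCount: view.getUint16(pos + 10, true),
      centralDirectorySize: view.getUint32(pos + 12, true),
      centralDirectoryOffset: view.getUint32(pos + 16, true),
    };
  }
  throw new Error('end of central directory record not found');
}
```

Only after this record is found do you know where the central directory is, which is why the end of the file has to be available before any entry can be listed reliably.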

If your server supports HTTP range requests you can use HTTPRangeReader so that only the parts of the zip file you need are downloaded.

import {unzip, HTTPRangeReader} from 'unzipit';

async function readFiles(url) {
  const reader = new HTTPRangeReader(url);
  const {zip, entries} = await unzip(reader);
  // ... access the entries as normal
}
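
If you need a reader along these lines with your own request options, a rough sketch might look like the following. SimpleRangeReader is hypothetical and is not unzipit's HTTPRangeReader. It assumes the server answers HEAD requests with a content-length header and honors Range requests with a 206 response. The fetchFn parameter is also an assumption, added so a custom or fake fetch can be injected.

```javascript
// Hypothetical reader for illustration only. It is not unzipit's HTTPRangeReader.
// fetchFn defaults to the global fetch (browsers, Node 18+).
class SimpleRangeReader {
  constructor(url, fetchFn = fetch) {
    this.url = url;
    // wrap so fetch is never called as a method of this object
    this.doFetch = (input, init) => fetchFn(input, init);
  }
  async getLength() {
    if (this.length === undefined) {
      const res = await this.doFetch(this.url, {method: 'HEAD'});
      if (!res.ok) {
        throw new Error(`HEAD request failed with status ${res.status}`);
      }
      const length = parseInt(res.headers.get('content-length'), 10);
      if (Number.isNaN(length)) {
        throw new Error('server did not report a content-length');
      }
      this.length = length;
    }
    return this.length;
  }
  async read(offset, size) {
    if (size === 0) {
      return new Uint8Array(0);
    }
    // Range end positions are inclusive
    const res = await this.doFetch(this.url, {
      headers: {Range: `bytes=${offset}-${offset + size - 1}`},
    });
    if (res.status !== 206) {
      throw new Error(`expected 206 Partial Content, got ${res.status}`);
    }
    return new Uint8Array(await res.arrayBuffer());
  }
}
```

Because it implements getLength and read, an instance could be passed to unzip like any other Reader.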

Special headers and options for network requests

The library takes a URL but there are no options for CORS, credentials, etc. If you need those, pass in a Blob or ArrayBuffer you fetched yourself.

import {unzip} from 'unzipit';

// ...

const res = await fetch(url, { mode: 'cors' });
const blob = await res.blob();
const {entries} = await unzip(blob);

Non UTF-8 Filenames

The zip standard predates Unicode so it's possible, and apparently not uncommon, for files to have non-Unicode names. entry.nameBytes contains the raw bytes of the filename, so you are free to decode the name using your own methods. See the example above.
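
If you don't know the encoding ahead of time, one heuristic is to try strict UTF-8 first and fall back to a legacy encoding. decodeEntryName below is a hypothetical helper, not part of unzipit, and the windows-1252 default is only an assumption. It is a guess rather than a guarantee, since some legacy-encoded names also happen to be valid UTF-8.

```javascript
// Hypothetical helper, not part of unzipit. Tries strict UTF-8 first and
// falls back to another encoding when the bytes are not valid UTF-8.
function decodeEntryName(nameBytes, fallbackEncoding = 'windows-1252') {
  try {
    // fatal: true makes decode() throw a TypeError on invalid UTF-8
    return new TextDecoder('utf-8', {fatal: true}).decode(nameBytes);
  } catch (e) {
    return new TextDecoder(fallbackEncoding).decode(nameBytes);
  }
}
```

You would call it on entry.nameBytes for each entry returned by unzipRaw. Which fallback encodings are available depends on the TextDecoder implementation of your browser or node build.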

Filename issues in general.

unzipit doesn't and can't know if a filename is valid for your use case. A zip file can have any name with any characters in the filename data. All unzipit can do is give you the filename as a string from the zip file. It's up to you to deal with it, for example to strip out or replace characters in the filename that are incompatible with your OS. For example, one zip file has these filenames: 'this#file\\name%is&iffy', '???And This one???', 'fo:oo', which I believe are problematic on Windows. A user found a file with double slashes as in foo//bar so you'll need to decide what to do with that.

There is also the issue that someone could make a malicious filename, for example "../../.bash_profile", in the hope that some program doesn't check the names and just uses the paths as is. If you're going to use unzipit to create files you should check and sanitize your paths.
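
As a starting point, a sanitizer could look something like the sketch below. sanitizeEntryPath is a hypothetical helper, not part of unzipit, and the policy it applies (reject any '..' part, collapse repeated slashes, replace characters Windows disallows with '_') is just one possible choice.

```javascript
// Hypothetical helper, not part of unzipit. Returns a relative path that can
// be joined onto an output folder, or throws if the name tries to escape it.
function sanitizeEntryPath(name) {
  const parts = [];
  // treat both '/' and '\' as separators
  for (const part of name.split(/[\\/]+/)) {
    if (part === '' || part === '.') {
      continue;  // comes from a leading or trailing slash, or from './'
    }
    if (part === '..') {
      throw new Error(`entry name escapes the output folder: ${name}`);
    }
    // replace characters that are not allowed in Windows filenames
    parts.push(part.replace(/[<>:"|?*\x00-\x1f]/g, '_'));
  }
  if (parts.length === 0) {
    throw new Error(`entry name is empty: ${name}`);
  }
  return parts.join('/');
}
```

This deliberately rejects even harmless names such as a/../b, and it does not handle reserved Windows device names like CON or NUL, so adjust it to your needs.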

ArrayBuffer and SharedArrayBuffer caveats

If you pass in an ArrayBuffer or SharedArrayBuffer you need to keep the data unchanged until you're finished using the data. The library doesn't make a copy, it uses the buffer directly.

Handling giant entries

There is no way for the library to know what "too large" means to you. The simple way to handle entries that are too large is to check their size before asking for their content.

  const kMaxSize = 1024*1024*1024*2;  // 2gig
  if (entry.size > kMaxSize) {
    throw new Error('this entry is larger than your max supported size');
  }
  const data = await entry.arrayBuffer();
  // ...

Encrypted, Password protected Files

unzipit does not currently support encrypted zip files and will throw if you try to get the data for one. Put it on the TODO list 😅

File Attributes

If you want to make an unzip utility using this library you'll need to be able to mark some files as executable. That is unfortunately platform specific. For example, Windows has no concept of "mark a file as executable". Each zip entry provides a versionMadeBy and externalFileAttributes property. You could theoretically use that to set file attributes. For example

fs.writeFileSync(filename, data);
if (process.platform === 'darwin' || process.platform === 'linux') {
  const platform = entry.versionMadeBy >> 8;  // upper byte is the host system
  const unix = 3;
  const darwin = 19;  // OS X (Darwin) in the zip spec's "version made by" table
  if (platform === unix || platform === darwin) {
    // no idea what's best here
    //                                                 +- owner read
    //                                                 |+- owner write
    //                                                 ||+- owner execute
    //                                                 |||+- group read
    //                                                 ||||+- group write
    //                                                 |||||+- group execute
    //                                                 ||||||+- other read
    //                                                 |||||||+- other write
    //                                                 ||||||||+- other execute
    //                                                 |||||||||
    //                                                 VVVVVVVVV
    let mod = (entry.externalFileAttributes >> 16) & 0b111111111;  // all the bits
    mod &= 0b111100100;   // remove write and executable from group and other?
    mod |= 0b110100100;   // add in owner R/W, group R, other R
    fs.chmodSync(filename, mod);
  }
}

Other Limitations

unzipit only supports the uncompressed and deflate compression algorithms. Other algorithms are defined in the zip spec but are uncommon.

Testing

When writing tests serve the folder with your favorite web server (recommend servez) then go to http://localhost:8080/test/ to easily re-run the tests. You can set a grep regular expression to only run certain tests http://localhost:8080/test/?grep=json. It's up to you to encode the regular expression for a URL. For example

encodeURIComponent('j(.*?)son')
"j(.*%3F)son"

so http://localhost:8080/test/?grep=j(.*%3F)son. The regular expression will be marked as case insensitive.

Of course you can also run npm test to run the tests from the command line.

Debugging

Follow the instructions on testing but add ?timeout=0 to the URL as in http://localhost:8080/test/?timeout=0

Live Browser Tests

https://greggman.github.io/unzipit/test/

Acknowledgements

  • The code is heavily based on yauzl
  • The code uses the es6 module version of uzip.js

Licence

MIT
