  • Stars
    391
  • Rank 109,324 (Top 3 %)
  • Language
    JavaScript
  • License
    GNU Affero Genera...
  • Created over 13 years ago
  • Updated about 1 year ago

Repository Details

Web proxy for evading internet censorship, and general-purpose Node.js library for proxying and rewriting remote webpages

unblocker

Unblocker was originally a web proxy for evading internet censorship, similar to CGIproxy / PHProxy / Glype but written in node.js. It's since morphed into a general-purpose library for proxying and rewriting remote webpages.

All data is processed and relayed to the client on the fly without unnecessary buffering, making unblocker one of the fastest web proxies available.

The magic part

The script uses "pretty" urls which, besides looking pretty, allow links with relative paths to just work without modification. (E.g. <a href="path/to/file2.html"></a>)

In addition to this, links that are relative to the root (E.g. <a href="/path/to/file2.html"></a>) can be handled without modification by checking the referrer and 307 redirecting them to the proper location in the referring site. (Although the proxy does attempt to rewrite these links to avoid the redirect.)
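As a rough illustration of the referrer check described above, the sketch below recovers the remote page from a proxied referrer and builds the location to redirect to. The function name `redirectFromReferer` and the `/proxy/` prefix are assumptions made for this example; it is not Unblocker's actual implementation.

```javascript
// Hypothetical helper, not part of Unblocker's API.
// A root-relative link such as /path/to/file2.html arrives at the proxy without
// the prefix. The Referer header still names the proxied page, so the remote
// site can be recovered from it.
function redirectFromReferer(prefix, referer, requestPath) {
    var refererUrl;
    try {
        refererUrl = new URL(referer);
    } catch (err) {
        return null; // missing or malformed Referer header
    }
    var proxiedPath = refererUrl.pathname + refererUrl.search;
    if (proxiedPath.indexOf(prefix) !== 0) {
        return null; // the referring page was not served through the proxy
    }
    try {
        // e.g. "http://example.com/path/to/file1.html"
        var remotePage = proxiedPath.slice(prefix.length);
        // resolve the root-relative path against the remote page
        return prefix + new URL(requestPath, remotePage).href;
    } catch (err) {
        return null; // the text after the prefix was not a valid URL
    }
}
```

A server using a helper like this would answer with a 307 status and the returned value in the Location header, falling through to its normal handling when the result is null.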

Cookies are proxied by adjusting their path to include the proxy's URL, and a bit of extra work is done to ensure they remain intact when switching protocols or subdomains.
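The path adjustment can be pictured with a small sketch. The helper name `rewriteCookiePath` and its arguments are hypothetical, and the real cookie middleware does more than this (for example, the protocol and subdomain handling mentioned above).

```javascript
// Hypothetical helper, not Unblocker's actual code.
// Rewrites the Path attribute of a Set-Cookie header value so the cookie is
// scoped to the remote site's location under the proxy prefix.
function rewriteCookiePath(setCookie, prefix, remoteOrigin) {
    var sawPath = false;
    var parts = setCookie.split(';').map(function (part, index) {
        part = part.trim();
        // index 0 is the name=value pair; attributes follow it
        if (index > 0 && /^path=/i.test(part)) {
            sawPath = true;
            return 'Path=' + prefix + remoteOrigin + part.slice('path='.length);
        }
        return part;
    });
    if (!sawPath) {
        // no Path attribute: scope the cookie to the whole remote site
        parts.push('Path=' + prefix + remoteOrigin + '/');
    }
    return parts.join('; ');
}
```

This sketch leaves the Domain attribute untouched; a real proxy would also need to deal with it, since the browser only ever sees the proxy's own domain.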

Limitations

Although the proxy works well for standard login forms and even most AJAX content, OAuth login forms and anything that uses postMessage (Google, Facebook, etc.) are not likely to work out of the box. This is not an insurmountable issue, but it's not one that I expect to have fixed in the near term.

More advanced websites, such as Roblox, Discord, YouTube*, Instagram, etc. do not currently work. At the moment, there is no timeframe for when these might be supported.

Patches are welcome, including both general-purpose improvements to go into the main library, and site-specific fixes to go in the examples folder.

Running the website on your computer

See https://github.com/nfriedly/nodeunblocker.com

Using unblocker as a library in your software

npm install --save unblocker

Unblocker exports an Express-compatible API, so using it in an Express application is trivial:

var express = require('express');
var Unblocker = require('unblocker');
var app = express();
var unblocker = new Unblocker({prefix: '/proxy/'});

// this must be one of the first app.use() calls and must not be on a subdirectory to work properly
app.use(unblocker);

app.get('/', function(req, res) {
    //...
});

// the upgrade handler allows unblocker to proxy websockets
app.listen(process.env.PORT || 8080).on('upgrade', unblocker.onUpgrade);

See examples/nodeunblocker.com/app.js for a complete example.

Usage without Express is similarly easy; see examples/simple/server.js for an example.

Configuration

Unblocker supports the following configuration options (defaults are shown):

{
    prefix: '/proxy/',  // Path that the proxied URLs begin with. '/' is not recommended due to a few edge cases.
    host: null, // Host used in redirects (e.g. `example.com` or `localhost:8080`). Default behavior is to determine this from the request headers.
    requestMiddleware: [], // Array of functions that perform extra processing on client requests before they are sent to the remote server. API is detailed below.
    responseMiddleware: [], // Array of functions that perform extra processing on remote responses before they are sent back to the client. API is detailed below.
    standardMiddleware: true, // Allows you to disable all built-in middleware if you need to perform advanced customization of requests or responses.
    clientScripts: true, // Injects JavaScript to force things like WebSockets and XMLHttpRequest to go through the proxy.
    processContentTypes: [ // All built-in middleware that modifies the content of responses limits itself to these content-types.
        'text/html',
        'application/xml+xhtml',
        'application/xhtml+xml',
        'text/css'
    ],
    httpAgent: null, // Overrides the agent used for HTTP requests to the remote server. See https://nodejs.org/api/http.html#http_class_http_agent
    httpsAgent: null // Overrides the agent used for HTTPS requests to the remote server. See https://nodejs.org/api/https.html#https_class_https_agent
}

Setting process.env.NODE_ENV='production' will enable more aggressive caching on the client scripts and potentially other optimizations in the future.
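As an illustration of the httpAgent and httpsAgent options listed above, the sketch below builds a config with keep-alive agents from Node's standard http and https modules. The specific settings (keepAlive, maxSockets: 50) are arbitrary choices for the example, not recommendations from the project.

```javascript
var http = require('http');
var https = require('https');

// Standard Node.js agents. keepAlive reuses sockets across requests to the
// same remote host; maxSockets caps concurrent sockets per host.
// The values here are illustrative assumptions.
var config = {
    prefix: '/proxy/',
    httpAgent: new http.Agent({ keepAlive: true, maxSockets: 50 }),
    httpsAgent: new https.Agent({ keepAlive: true, maxSockets: 50 })
};

// The object would then be passed to `new Unblocker(config)`.
```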

Custom Middleware

Unblocker "middleware" are small functions that allow you to inspect and modify requests and responses. The majority of Unblocker's internal logic is implemented as middleware, and it's possible to write custom middleware to augment or replace the built-in middleware.

Custom middleware should be a function that accepts a single data argument and runs synchronously.

To process request and response data, create a Transform Stream to perform the processing in chunks and pipe through this stream. (Example below.)

To respond directly to a request, add a function to config.requestMiddleware that handles the clientResponse (a standard http.ServerResponse when used directly, or an Express Response when used with Express). Once a response is sent, no further middleware will be executed for that request. (Example below.)

requestMiddleware

Data example:

{
    url: 'http://example.com/',
    clientRequest: {request},
    clientResponse: {response},
    headers: {
        //...
    },
    stream: {ReadableStream of data for PUT/POST requests, empty stream for other types}
}

requestMiddleware may inspect the headers, url, etc. It can modify headers, pipe PUT/POST data through a transform stream, or respond to the request directly. If you're using express, the request and response objects will have all of the usual express goodies. For example:

function validateRequest(data) {
    // dots are escaped so that only en.wikipedia.org matches
    if (!data.url.match(/^https?:\/\/en\.wikipedia\.org\//)) {
        data.clientResponse.status(403).send('Wikipedia only.');
    }
}
var config = {
    requestMiddleware: [
        validateRequest
    ]
};

If any piece of middleware sends a response, no further middleware is run.

After all requestMiddleware has run, the request is forwarded to the remote server with the (potentially modified) url/headers/stream/etc.
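Modifying headers works the same way. The sketch below is a hypothetical middleware factory (`setRequestHeader` is not part of Unblocker) that overrides one header before the request is forwarded. It assumes header names in data.headers are lowercase, as they are on Node's incoming request objects.

```javascript
// Hypothetical factory: returns a requestMiddleware function that sets one
// header on the outgoing request. Assumes lowercase header names.
function setRequestHeader(name, value) {
    return function (data) {
        data.headers[name.toLowerCase()] = value;
    };
}

var config = {
    requestMiddleware: [
        setRequestHeader('User-Agent', 'my-proxy/1.0') // illustrative value
    ]
};
```

Because the function mutates data.headers in place and returns synchronously, it fits the single-argument, synchronous contract described earlier.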

responseMiddleware

responseMiddleware receives the same data object as the requestMiddleware, but the headers and stream fields are replaced with those of the remote server's response, and several new fields are added for the remote request and response:

Data example:

{
    url: 'http://example.com/',
    clientRequest: {request},
    clientResponse: {response},
    remoteRequest: {request},
    remoteResponse: {response},
    contentType: 'text/html',
    headers: {
        //...
    },
    stream: {ReadableStream of response data}
}

For modifying content, create a new stream and then pipe data.stream to it and replace data.stream with it:

var Transform = require('stream').Transform;

function injectScript(data) {
    if (data.contentType === 'text/html') {

        // https://nodejs.org/api/stream.html#stream_transform
        var myStream = new Transform({
            decodeStrings: false,
            transform: function(chunk, encoding, next) {
                // note: a '</body>' tag split across two chunks will not be matched
                chunk = chunk.toString().replace('</body>', '<script src="/my/script.js"></script></body>');
                this.push(chunk);
                next();
            }
        });

        data.stream = data.stream.pipe(myStream);
    }
}

var config = {
    responseMiddleware: [
        injectScript
    ]
}
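The example above replaces text within each chunk separately, so a closing tag that happens to straddle two chunks would be missed. One way around this, sketched below under the assumption that chunks arrive as strings or UTF-8 buffers, is to hold back the last few characters of each chunk until the next one arrives. `replaceStream` is a hypothetical helper, not something Unblocker provides.

```javascript
var Transform = require('stream').Transform;
var StringDecoder = require('string_decoder').StringDecoder;

// Hypothetical helper: returns a Transform that replaces every occurrence of
// `needle` with `replacement`, including occurrences split across chunks.
function replaceStream(needle, replacement) {
    var decoder = new StringDecoder('utf8'); // avoids splitting multi-byte characters
    var tail = ''; // unprocessed text held back from the previous chunk

    return new Transform({
        decodeStrings: false,
        transform: function (chunk, encoding, next) {
            var text = tail + (typeof chunk === 'string' ? chunk : decoder.write(chunk));
            var out = '';
            var index;
            while ((index = text.indexOf(needle)) !== -1) {
                out += text.slice(0, index) + replacement;
                text = text.slice(index + needle.length);
            }
            // Keep just enough text to complete a needle that starts near the end.
            var keep = Math.min(needle.length - 1, text.length);
            out += text.slice(0, text.length - keep);
            tail = text.slice(text.length - keep);
            if (out) {
                this.push(out);
            }
            next();
        },
        flush: function (done) {
            // Emit whatever was held back once the input ends.
            this.push(tail + decoder.end());
            done();
        }
    });
}
```

Inside a responseMiddleware function this would be used in the same way as the stream above, for example `data.stream = data.stream.pipe(replaceStream('</body>', '<script src="/my/script.js"></script></body>'));`.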

See examples/nodeunblocker.com/app.js for another example of adding a bit of middleware. Also, see any of the built-in middleware in the lib/ folder.

Built-in Middleware

Most of the internal functionality of the proxy is also implemented as middleware:

  • host: Corrects the host header in outgoing requests
  • referer: Corrects the referer header in outgoing requests
  • cookies: Fixes the Path attribute of set-cookie headers to limit cookies to their "path" on the proxy (e.g. Path=/proxy/http://example.com/). Also injects redirects to copy cookies between protocols and subdomains on a given domain.
  • hsts: Removes Strict-Transport-Security headers because they can leak to other sites and can break the proxy.
  • hpkp: Removes Public-Key-Pins (HTTP Public Key Pinning) headers because they can leak to other sites and can break the proxy.
  • csp: Removes Content-Security-Policy headers because they can leak to other sites and can break the proxy.
  • redirects: Rewrites urls in 3xx redirects to ensure they go through the proxy
  • decompress: Decompresses Content-Encoding: gzip|deflate responses and also tweaks request headers to ask for either gzip-only or no compression at all. (It will attempt to decompress deflate content, but there are some issues, so it does not advertise support for deflate.)
  • charsets: Converts the charset of responses to UTF-8 for safe string processing in node.js. Determines charset from headers or meta tags and rewrites all headers and meta tags in outgoing response.
  • urlPrefixer: Rewrites URLs of links/images/css/etc. to ensure they go through the proxy
  • metaRobots: Injects a ROBOTS: NOINDEX, NOFOLLOW meta tag to prevent search engines from crawling the entire web through the proxy.
  • contentLength: Deletes the content-length header on responses if the body was modified.
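To give a feel for how small such middleware can be, here is a sketch of a header-stripping function in the spirit of the hsts and csp entries above. It is written from the descriptions in this list, not copied from Unblocker's source, and `removeHeaders` is a hypothetical name. It assumes response header names in data.headers are lowercase.

```javascript
// Hypothetical factory, modelled on the descriptions above rather than on
// Unblocker's source. Returns a responseMiddleware function that deletes the
// named headers from the remote server's response.
function removeHeaders(names) {
    var lowerNames = names.map(function (name) {
        return name.toLowerCase();
    });
    return function (data) {
        lowerNames.forEach(function (name) {
            delete data.headers[name];
        });
    };
}

var stripSecurityHeaders = removeHeaders([
    'Strict-Transport-Security',
    'Content-Security-Policy'
]);
```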

Setting the standardMiddleware configuration option to false disables all built-in middleware, allowing you to selectively enable, configure, and re-order the built-in middleware.

This configuration would mimic the defaults:

var express = require('express');
var Unblocker = require('unblocker');

var app = express();

var config = {
    prefix: '/proxy/',
    host: null,
    requestMiddleware: [],
    responseMiddleware: [],
    standardMiddleware: false,  // disables all built-in middleware
    processContentTypes: [
        'text/html',
        'application/xml+xhtml',
        'application/xhtml+xml',
        'text/css'
    ]
};

var host = Unblocker.host(config);
var referer = Unblocker.referer(config);
var cookies = Unblocker.cookies(config);
var hsts = Unblocker.hsts(config);
var hpkp = Unblocker.hpkp(config);
var csp = Unblocker.csp(config);
var redirects = Unblocker.redirects(config);
var decompress = Unblocker.decompress(config);
var charsets = Unblocker.charsets(config);
var urlPrefixer = Unblocker.urlPrefixer(config);
var metaRobots = Unblocker.metaRobots(config);
var contentLength = Unblocker.contentLength(config);

config.requestMiddleware = [
    host,
    referer,
    decompress.handleRequest,
    cookies.handleRequest
    // custom requestMiddleware here
];

config.responseMiddleware = [
    hsts,
    hpkp,
    csp,
    redirects,
    decompress.handleResponse,
    charsets,
    urlPrefixer,
    cookies.handleResponse,
    metaRobots,
    // custom responseMiddleware here
    contentLength
];

var unblocker = new Unblocker(config);
app.use(unblocker);

// ...

// the upgrade handler allows unblocker to proxy websockets
app.listen(process.env.PORT || 8080).on('upgrade', unblocker.onUpgrade);

Debugging

Unblocker is fully instrumented with debug. Enable debugging via environment variables:

DEBUG=unblocker:* node mycoolapp.js

There is also a middleware debugger that adds extra debugging middleware before and after each existing middleware function to report on changes. It's included with the default DEBUG activation and may also be selectively enabled:

DEBUG=unblocker:middleware node mycoolapp.js

... or disabled:

DEBUG=*,-unblocker:middleware node mycoolapp.js

Troubleshooting

If you're using Nginx as a reverse proxy, you probably need to disable merge_slashes to avoid endless redirects and/or other issues:

merge_slashes off;

Todo

  • Consider adding compress middleware to compress text-like responses
  • Un-prefix urls in GET / POST data
  • Inject js to proxy postMessage data and fix origins
  • More examples
  • Even more tests

AGPL-3.0 License

This project is released under the terms of the GNU Affero General Public License version 3.

All source code is copyright Nathan Friedly.

Commercial licensing and support are also available; contact Nathan Friedly for details.

More Repositories

1

set-cookie-parser

Parse HTTP set-cookie headers in JavaScript
JavaScript
154
star
2

nodeunblocker.com

Evade internet censorship!
HTML
150
star
3

Javascript-Flash-Cookies

Cross-domain flash cookie library for javascript. ~ 4kb total when JS is minified and gzipped.
JavaScript
107
star
4

node-bestzip

Provides a `bestzip` command that uses the system `zip` if available, and a Node.js implementation otherwise.
JavaScript
80
star
5

spam-free-php-contact-form

Simple, human-friendly contact form (no captchas). Uses JavaScript and hidden fields to thwart spammers.
HTML
70
star
6

Coin-Allocator

Bitcoin/Altcoin/USD trading bot. Was moderately profitable, until the exchange got hacked. No longer under active development.
JavaScript
49
star
7

DuckDuckGo-GoogleSuggest

Node JS server that proxies google suggest queries for the Duck Duck Go search box, and adds the !'s back on when google removes them.
HTML
40
star
8

facebook-js-sdk

Facebook's debug.js (what gets minified into sdk.js), updated every 10 minutes
JavaScript
39
star
9

approximate-number

Converts numbers into a more human-friendly format. E.g. 123456 becomes 123k. Similar to `ls -lh` or Stack Overflow's reputation numbers.
JavaScript
37
star
10

get-user-media-promise

Basic wrapper for navigator.mediaDevices.getUserMedia with automatic fallback to navigator.getUserMedia
JavaScript
22
star
11

node-pagerank

Node.js library for looking up the Google PageRank of a given site. No longer functional.
JavaScript
17
star
12

miyoo-toolchain

Dockerfile to build an image with the toolchain and other dependencies to compile software for the Miyoo Custom Firmware (CFW)
Dockerfile
12
star
13

nodemcu-weather-station

Displays current weather conditions inside and out
Lua
12
star
14

nfriedly.com

My personal website. Contact info, portfolio, links, etc.
JavaScript
12
star
15

couchdb-backup-restore

Node.js library for simple backup and restore of CouchDB databases
JavaScript
12
star
16

node-gatling

A simple node.js script that turns a single-threaded server into a multi-threaded server with automatic restarting.
JavaScript
8
star
17

contentful-dictate

A UI Extension for Contentful that uses IBM Watson Speech to Text to enable voice dictation.
JavaScript
7
star
18

node-dreamhost-dns-updater

A quick script I built to set a given hostname to my current IP via Dreamhost's API
JavaScript
5
star
19

dog-food

Gadget that answers the question of "Did anyone feed the dog yet?"
Python
4
star
20

node-whats-my-ip

Simple text-based service to find your public IP. Can be run for free on heroku (and likely other similar services)
HTML
4
star
21

docpad-plugin-cloudant

Cloudant importer for DocPad (Cloudant is a hosted couchdb service)
CoffeeScript
3
star
22

docpad-plugin-mongodb

MongoDB importer for DocPad
CoffeeScript
3
star
23

gps.bb.tracking

Blackberry GPS tracker similar to google's android one.
Java
3
star
24

prefix-stream

Prepend each chunk in a node.js text stream with the given prefix
JavaScript
3
star
25

aplexa

Web App that shows what song the Plex skill for Alexa is currently playing
JavaScript
2
star
26

Meteor-ODB-II

A quick ODB-II (vehicle diagnostic code) search website built with Meteor
JavaScript
2
star
27

running-average

Memory-efficient module that tracks the average value of an unlimited quantity of numbers
JavaScript
2
star
28

vzw-bot

Bot that automatically logs into My Verizon, reports data usage, and can spend "smart rewards" points on available sweepstakes.
JavaScript
2
star
29

space-jump

A cross between Lunar Lander and Doodle Jump
JavaScript
2
star
30

dreamhost

DreamHost API client for Node.js and Browsers
JavaScript
2
star
31

hn-avatars

UserScript to generate avatars next to usernames on Hacker News
JavaScript
2
star
32

docpad-plugin-redirector

DocPad plugin for redirecting URLs to other websites via configuration.
CoffeeScript
2
star
33

Stripe-CTF-2014-level2

Dynamic ingress filter to fight off a DDOS while allowing legitimate traffic through
JavaScript
1
star
34

eleventy-plugin-less

Plugin for eleventy (11ty) to convert Less stylesheets to CSS
JavaScript
1
star
35

wdc-deep-dive

Code from my Technical Deep Dive presentation on the IBM Watson Developer Cloud Node.js SDK
JavaScript
1
star
36

MacAdSense

An updated version of Kai 'Oswald' Seidler's MacAdSense dashboard widget
1
star
37

sweetspot

"Sweeps Bot" - Bot to help you remember & enter sweepstakes
JavaScript
1
star
38

aoc-2022-02

Advent of Code 2022 - Day 2
Rust
1
star
39

rss-xslt

A tool I built in college to add a custom XSLT theme to an arbitrary RSS feed. Moving it here for safe keeping.
PHP
1
star
40

ua.nfriedly.com

Static User Agent parser, previously at whatsmyua.com
HTML
1
star
41

contributor-locations

List the locations of all contributors to a GitHub repo.
JavaScript
1
star
42

aoc-2022-1

Advent of Code 2022 - Day 1
Rust
1
star
43

oled-saver-watchface

Watchface for Wear OS that randomly moves the time around to avoid burn-in on OLED displays
Java
1
star
44

JS-Mini-Shell

A Super-lightweight interactive JavaScript shell that fits into a bookmarklet
CSS
1
star
45

picsync-server

Accepts uploaded photos and stores them privately, allowing you to later review them and post your favorites to Facebook. Written for node.js
CSS
1
star
46

BiblePeople

RoR based website with details on people and family lines in the Bible
Ruby
1
star
47

grunt-swf

Compiles .as files to .swf via the Apache Flex SDK (free but must be installed separately)
JavaScript
1
star
48

Arduino-Fan-Controler

Controls a whole-house fan for energy-efficient home automation
Arduino
1
star
49

puck.js-media-control

Remote control to pause, resume, and rewind audiobooks playing from my phone
JavaScript
1
star
50

aoc-2022

Advent of Code 2022
Rust
1
star
51

web-shell

Interactive command prompt for locked down app servers (such as Bluemix). Highly insecure.
JavaScript
1
star
52

value-averaging

A website to help make the value averaging investment strategy easier.
JavaScript
1
star
53

socket.io-example

Just a quick demo I put together
JavaScript
1
star
54

aoc-2022-03

Advent of Code 2022 - Day 3
Rust
1
star
55

ypool-xpm-miner-watcher

A node.js script to watch ypool.net's PrimeCoin jhPrimeminer and restart it every time it crashes
JavaScript
1
star