knox
Node Amazon S3 Client.
Features
- Familiar API (client.get(), client.put(), etc.)
- Very Node-like low-level request capabilities via http.Client
- Higher-level API with client.putStream(), client.getFile(), etc.
- Copying and multi-file delete support
- Streaming file upload and direct stream-piping support
Examples
The following examples demonstrate some capabilities of knox and the S3 REST API. First things first, create an S3 client:
var knox = require('knox');
var client = knox.createClient({
key: '<api-key-here>'
, secret: '<secret-here>'
, bucket: 'learnboost'
});
More options are documented below for features like other endpoints or regions.
PUT
If you want to directly upload some strings to S3, you can use the Client#put
method with a string or buffer, just like you would for any http.Client
request. You pass in the filename as the first parameter and some headers as
the second, then listen for a 'response' event on the request. Then send the
request using req.end(). If we get a 200 response, great!
If you send a string, set Content-Length to the string's length in bytes (for
example via Buffer.byteLength()), rather than to the number of characters in
the string.
var object = { foo: "bar" };
var string = JSON.stringify(object);
var req = client.put('/test/obj.json', {
'Content-Length': Buffer.byteLength(string)
, 'Content-Type': 'application/json'
});
req.on('response', function(res){
if (200 == res.statusCode) {
console.log('saved to %s', req.url);
}
});
req.end(string);
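To see why the byte length matters, here is a small sketch comparing the two lengths for a string that contains a multi-byte character. The sample strings and the lengths helper are made up for illustration.

```javascript
// Compare character count with UTF-8 byte count.
function lengths(str) {
  return { chars: str.length, bytes: Buffer.byteLength(str, 'utf8') };
}

var ascii = 'hello';
var accented = 'h\u00e9llo'; // the accented "e" takes two bytes in UTF-8

console.log(lengths(ascii));    // chars and bytes agree
console.log(lengths(accented)); // bytes is one larger than chars
```

Using the character count for the second string would declare a Content-Length one byte short of the body actually sent.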
By default the x-amz-acl header is set to private. To alter this, simply pass the header to the client request method.
client.put('/test/obj.json', { 'x-amz-acl': 'public-read' });
Each HTTP verb has an alternate method with the "File" suffix. For example,
put() also has a higher-level method named putFile(), which accepts a source
filename and performs the dirty work shown above for you. Here is an example
usage:
client.putFile('my.json', '/user.json', function(err, res){
// Always either do something with `res` or at least call `res.resume()`.
});
Another alternative is to stream via Client#putStream(), for example:
http.get('http://google.com/doodle.png', function(res){
var headers = {
'Content-Length': res.headers['content-length']
, 'Content-Type': res.headers['content-type']
};
client.putStream(res, '/doodle.png', headers, function(err, res){
// check `err`, then do `res.pipe(..)` or `res.resume()` or whatever.
});
});
You can also use your stream's pipe method to pipe to the PUT request, but
you'll still have to set the 'Content-Length' header. For example:
fs.stat('./Readme.md', function(err, stat){
// Be sure to handle `err`.
var req = client.put('/Readme.md', {
'Content-Length': stat.size
, 'Content-Type': 'text/plain'
});
fs.createReadStream('./Readme.md').pipe(req);
req.on('response', function(res){
// ...
});
});
Finally, if you want a nice interface for putting a buffer or a string of data,
use Client#putBuffer():
var buffer = Buffer.from('a string of data');
var headers = {
'Content-Type': 'text/plain'
};
client.putBuffer(buffer, '/string.txt', headers, function(err, res){
// ...
});
Note that both putFile and putStream will stream to S3 instead of reading
into memory, which is great. They also return objects that emit 'progress'
events, so you can monitor how the streaming goes! The progress events have
the fields written, total, and percent.
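A minimal sketch of consuming those progress events follows. A plain EventEmitter stands in for the object returned by putFile or putStream so the snippet runs without S3 credentials; the describeProgress and watchProgress helper names are made up for illustration.

```javascript
var EventEmitter = require('events').EventEmitter;

// Format one progress event using the written, total, and percent fields.
function describeProgress(e) {
  return e.written + '/' + e.total + ' bytes (' + e.percent + '%)';
}

// Attach a listener to anything that emits 'progress' events.
function watchProgress(emitter, log) {
  emitter.on('progress', function(e){
    log(describeProgress(e));
  });
  return emitter;
}

// Stand-in for the return value of client.putFile(...) or client.putStream(...).
var upload = new EventEmitter();
watchProgress(upload, console.log);
upload.emit('progress', { written: 512, total: 1024, percent: 50 });
```

With a real client you would pass the object returned by putFile or putStream to watchProgress instead of the stand-in emitter.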
GET
Below is an example GET request on the file we just shoved at S3. It simply outputs the response status code, headers, and body.
client.get('/test/Readme.md').on('response', function(res){
console.log(res.statusCode);
console.log(res.headers);
res.setEncoding('utf8');
res.on('data', function(chunk){
console.log(chunk);
});
}).end();
There is also Client#getFile(), which uses a callback pattern instead of giving
you the raw request:
client.getFile('/test/Readme.md', function(err, res){
// check `err`, then do `res.pipe(..)` or `res.resume()` or whatever.
});
DELETE
Delete our file:
client.del('/test/Readme.md').on('response', function(res){
console.log(res.statusCode);
console.log(res.headers);
}).end();
Likewise we also have Client#deleteFile() as a more concise (yet less
flexible) solution:
client.deleteFile('/test/Readme.md', function(err, res){
// check `err`, then do `res.pipe(..)` or `res.resume()` or whatever.
});
HEAD
As you might expect we have Client#head and Client#headFile, following the
same pattern as above.
Advanced Operations
Knox supports a few advanced operations, such as copying files:
client.copy('/test/source.txt', '/test/dest.txt').on('response', function(res){
console.log(res.statusCode);
console.log(res.headers);
}).end();
// or
client.copyFile('/source.txt', '/dest.txt', function(err, res){
// ...
});
even between buckets:
client.copyTo('/source.txt', 'dest-bucket', '/dest.txt').on('response', function(res){
// ...
}).end();
and even between buckets in different regions:
var destOptions = { region: 'us-west-2', bucket: 'dest-bucket' };
client.copyTo('/source.txt', destOptions, '/dest.txt').on('response', function(res){
// ...
}).end();
or deleting multiple files at once:
client.deleteMultiple(['/test/Readme.md', '/test/Readme.markdown'], function(err, res){
// ...
});
or listing all the files in your bucket:
client.list({ prefix: 'my-prefix' }, function(err, data){
/* `data` will look roughly like:
{
Prefix: 'my-prefix',
IsTruncated: true,
MaxKeys: 1000,
Contents: [
{
Key: 'whatever',
LastModified: new Date(2012, 11, 25, 0, 0, 0),
ETag: 'whatever',
Size: 123,
Owner: 'you',
StorageClass: 'whatever'
},
⋮
]
}
*/
});
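Because IsTruncated can be true, a listing may span several pages. The sketch below walks every page by passing the last key seen as a marker option. That option name comes from the S3 REST API and is assumed here to be passed through by client.list; listAll is a hypothetical helper, and the stub client exists only so the example runs offline.

```javascript
// Collect every key under a prefix, following truncated listings.
// Assumption: client.list(options, cb) accepts a `marker` option and yields
// data with IsTruncated and Contents[].Key, as in the comment above.
function listAll(client, prefix, callback) {
  var keys = [];
  function page(marker) {
    var options = { prefix: prefix };
    if (marker) options.marker = marker;
    client.list(options, function(err, data){
      if (err) return callback(err);
      data.Contents.forEach(function(item){ keys.push(item.Key); });
      if (data.IsTruncated && data.Contents.length) {
        // Resume after the last key of this page.
        return page(data.Contents[data.Contents.length - 1].Key);
      }
      callback(null, keys);
    });
  }
  page(null);
}

// Offline stub that serves two keys per page.
var allKeys = ['a/1', 'a/2', 'a/3'];
var stubClient = {
  list: function(options, cb){
    var start = options.marker ? allKeys.indexOf(options.marker) + 1 : 0;
    var slice = allKeys.slice(start, start + 2);
    cb(null, {
      IsTruncated: start + 2 < allKeys.length,
      Contents: slice.map(function(k){ return { Key: k }; })
    });
  }
};

listAll(stubClient, 'a/', function(err, keys){
  console.log(err || keys);
});
```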
And you can always issue ad-hoc requests, e.g. the following to get an object's ACL:
client.request('GET', '/test/Readme.md?acl').on('response', function(res){
// Read and parse the XML response.
// Everyone loves XML parsing.
}).end();
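Reading the response means collecting its 'data' chunks until 'end'. Here is a small sketch of that step; collectBody is a hypothetical helper, and an EventEmitter stands in for the S3 response so the snippet runs offline. Parsing the XML itself is left to whichever parser you prefer.

```javascript
var EventEmitter = require('events').EventEmitter;

// Buffer a response stream and hand back its body as a UTF-8 string.
function collectBody(res, callback) {
  var chunks = [];
  res.on('data', function(chunk){ chunks.push(Buffer.from(chunk)); });
  res.on('error', callback);
  res.on('end', function(){
    callback(null, Buffer.concat(chunks).toString('utf8'));
  });
}

// Stand-in for the `res` passed to the 'response' handler.
var fakeRes = new EventEmitter();
collectBody(fakeRes, function(err, body){
  console.log(err || body);
});
fakeRes.emit('data', Buffer.from('<AccessControlPolicy>'));
fakeRes.emit('data', Buffer.from('</AccessControlPolicy>'));
fakeRes.emit('end');
```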
Finally, you can construct HTTP or HTTPS URLs for a file like so:
var readmeUrl = client.http('/test/Readme.md');
var userDataUrl = client.https('/user.json');
Client Creation Options
Besides the required key, secret, and bucket options, you can supply any
of the following:
endpoint
By default knox will send all requests to the global endpoint
(s3.amazonaws.com). This works regardless of the region where the bucket
is. But if you want to manually set the endpoint, e.g. for performance or
testing reasons, or because you are using an S3-compatible service that isn't
hosted by Amazon, you can do it with the endpoint option.
region
For your convenience when using buckets not in the US Standard region, you can
specify the region option. When you do so, the endpoint is automatically
assembled.
As of this writing, valid values for the region option are:
- US Standard (default): us-standard
- US West (Oregon): us-west-2
- US West (Northern California): us-west-1
- EU (Ireland): eu-west-1
- Asia Pacific (Singapore): ap-southeast-1
- Asia Pacific (Tokyo): ap-northeast-1
- South America (Sao Paulo): sa-east-1
If new regions are added later, their subdomain names will also work when passed
as the region option. See the AWS endpoint documentation for the latest list.
Convenience APIs such as putFile and putStream currently do not work as
expected with buckets in regions other than US Standard unless you explicitly
specify the region option. This will eventually be addressed by resolving
issue #66; however, for performance reasons, it is always best to specify
the region option anyway.
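As a rough illustration of what assembling an endpoint from a region could look like, the sketch below assumes the s3-<region>.amazonaws.com host naming that AWS has used for regional endpoints. It is not taken from knox's source, and endpointForRegion is a hypothetical helper.

```javascript
// Hypothetical helper: derive an endpoint host from a region option.
// Assumption: regional hosts follow "s3-<region>.amazonaws.com", and
// 'us-standard' maps to the global endpoint.
function endpointForRegion(region) {
  if (!region || region === 'us-standard') return 's3.amazonaws.com';
  return 's3-' + region + '.amazonaws.com';
}

console.log(endpointForRegion('us-standard'));
console.log(endpointForRegion('us-west-2'));
```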
secure and port
By default, knox uses HTTPS to connect to S3 on port 443. You can override
either of these with the secure and port options. Note that if you specify a
custom port option, the default for secure switches to false, although
you can override it manually if you want to run HTTPS against a specific port.
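The defaulting rules above can be sketched as a small function. resolveTransport is a hypothetical helper written for this illustration, not knox's own code, and falling back to port 80 when secure is false and no port is given is an assumption.

```javascript
// Hypothetical helper mirroring the secure/port defaulting rules.
function resolveTransport(options) {
  options = options || {};
  var hasPort = options.port !== undefined;
  // An explicit `secure` always wins; otherwise a custom port implies HTTP.
  var secure = options.secure !== undefined ? !!options.secure : !hasPort;
  // Assumption: plain HTTP without an explicit port uses port 80.
  var port = hasPort ? options.port : (secure ? 443 : 80);
  return { secure: secure, port: port };
}

console.log(resolveTransport({}));                            // HTTPS on 443
console.log(resolveTransport({ port: 8080 }));                // HTTP on 8080
console.log(resolveTransport({ port: 8443, secure: true }));  // HTTPS on 8443
```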
token
If you are using the AWS Security Token Service APIs, you can construct
the client with a token parameter containing the temporary security
credentials token. This simply sets the x-amz-security-token header on every
request made by the client.
style
By default, knox tries to use the "virtual hosted style" URLs for accessing S3,
e.g. bucket.s3.amazonaws.com. If you pass in "path" as the style option,
or pass in a bucket value that cannot be used with virtual hosted style URLs,
knox will use "path style" URLs, e.g. s3.amazonaws.com/bucket. There are
tradeoffs you should be aware of:
- Virtual hosted style URLs can work with any region, without requiring it to be explicitly specified; path style URLs cannot.
- You can access programmatically-created buckets only by using virtual hosted style URLs; path style URLs will not work.
- You can access buckets with periods in their names over SSL using path style URLs; virtual hosted style URLs will not work unless you turn off certificate validation.
- You can access buckets with mixed-case names only using path style URLs; virtual hosted style URLs will not work.
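To make the two URL shapes concrete, here is a small sketch. Both helpers are hypothetical and only approximate the tradeoffs listed above: the name check covers mixed-case names and names with periods, not every S3 naming rule, and it does not reproduce knox's own detection logic.

```javascript
// Hypothetical helper: build a base URL for a bucket in either style.
// Any style other than 'path' is treated as virtual hosted in this sketch.
function bucketUrl(bucket, style, endpoint) {
  endpoint = endpoint || 's3.amazonaws.com';
  if (style === 'path') return 'https://' + endpoint + '/' + bucket;
  return 'https://' + bucket + '.' + endpoint;
}

// Rough check: mixed-case names, and names with periods when using SSL,
// are better served by path style URLs.
function needsPathStyle(bucket) {
  return bucket !== bucket.toLowerCase() || bucket.indexOf('.') !== -1;
}

['learnboost', 'my.bucket', 'MyBucket'].forEach(function(bucket){
  var style = needsPathStyle(bucket) ? 'path' : undefined;
  console.log(bucketUrl(bucket, style));
});
```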
For more information on the differences between these two types of URLs, and limitations related to them, see the S3 documentation.
agent
Knox disables the default HTTP agent, because it leads to lots of "socket
hang up" errors when doing more than 5 requests at once. See #116 for
details. If you want to get the default agent back, you can specify
agent: require("https").globalAgent, or use your own.
Beyond Knox
Multipart Upload
S3's multipart upload is their rather-complicated way of uploading large files. In particular, it is the only way of streaming files without knowing their Content-Length ahead of time.
Adding the complexity of multipart upload directly to knox is not a great idea. For example, it requires buffering at least 5 MiB of data at a time in memory, which you want to avoid if possible. Fortunately, @nathanoehlman has created the excellent knox-mpu package to let you use multipart upload with knox if you need it!
Easy Download/Upload
@superjoe30 has created a nice library, called simply s3, that makes it very easy to upload local files directly to S3, and download them back to your filesystem. For simple cases this is often exactly what you want!
Uploading With Retries and Exponential Backoff
@jergason created intimidate, a library wrapping Knox to automatically retry failed uploads with exponential backoff. This helps your app deal with intermittent connectivity to S3 without bringing it to a grinding halt.
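For readers unfamiliar with the technique, here is a generic sketch of retrying with exponential backoff. It does not show how intimidate is implemented; backoffDelay and withBackoff are hypothetical helpers, and the base and maximum delays are arbitrary.

```javascript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelay(attempt, baseMs, maxMs) {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Run an async operation `fn(cb)`, retrying up to `retries` extra times
// and waiting longer before each new attempt.
function withBackoff(fn, retries, baseMs, maxMs, callback) {
  var attempt = 0;
  function run() {
    fn(function(err, result){
      if (!err || attempt >= retries) return callback(err, result);
      setTimeout(run, backoffDelay(attempt++, baseMs, maxMs));
    });
  }
  run();
}

// Delays for the first five retries with a 100 ms base and 1000 ms cap.
console.log([0, 1, 2, 3, 4].map(function(n){
  return backoffDelay(n, 100, 1000);
}));
```

An upload could be wrapped by passing a function that calls client.putFile and forwards its error to the callback.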
Listing and Copying Large Buckets
@goodeggs created knox-copy to easily copy and stream keys of buckets beyond Amazon's 1000 key page size limit.
@segmentio created s3-lister to stream a list of bucket keys using the new streams2 interface.
@drob created s3-deleter, a writable stream that batch-deletes bucket keys.
Running Tests
To run the test suite you must first have an S3 account. Then create a file named ./test/auth.json, which contains your credentials as JSON, for example:
{
"key": "<api-key-here>",
"secret": "<secret-here>",
"bucket": "<your-bucket-name>",
"bucket2": "<another-bucket-name>",
"bucketUsWest2": "<bucket-in-us-west-2-region-here>"
}
Then install the dev dependencies and execute the test suite:
$ npm install
$ npm test