  • Stars
    527
  • Rank 84,091 (Top 2 %)
  • Language
    JavaScript
  • License
    MIT License
  • Created almost 12 years ago
  • Updated 4 months ago

chrome-har-capturer

Capture HAR files from a Chrome instance.

Under the hood this module uses chrome-remote-interface to instrument Chrome.

Setup

Install this module from NPM:

npm install chrome-har-capturer

Start Chrome like this:

google-chrome --remote-debugging-port=9222 --headless

Command line utility

The command line utility can be used to generate HAR files from a list of URLs. The following options are available:

-h, --help               output usage information
-t, --host <host>        Chrome Debugging Protocol host
-p, --port <port>        Chrome Debugging Protocol port
-x, --width <dip>        frame width in DIP
-y, --height <dip>       frame height in DIP
-o, --output <file>      write to file instead of stdout
-c, --content            also capture the requests body
-k, --cache              allow caching
-a, --agent <agent>      user agent override
-b, --block <URL>        URL pattern (*) to block (can be repeated)
-H, --header <header>    Additional headers (can be repeated)
-i, --insecure           ignore certificate errors
-g, --grace <ms>         time to wait after the load event
-u, --timeout <ms>       time to wait before giving up with a URL
-r, --retry <number>     number of retries on page load failure
-e, --retry-delay <ms>   time to wait before starting a new attempt
-f, --abort-on-failure   stop after the first failure (incompatible with parallel mode)
-d, --post-data <bytes>  maximum POST data size to be returned
-l, --parallel <n>       load <n> URLs in parallel

Library

Alternatively, this module provides a simple API that can be used to write custom applications. See the command line utility source code for a working example.

API

run(urls, [options])

Start the loading of a batch of URLs. Returns an event emitter (see below for the list of supported events).

urls is an array of URLs.

options is an object with the following optional properties:

  • host: Chrome Debugging Protocol host. Defaults to localhost;

  • port: Chrome Debugging Protocol port. Defaults to 9222;

  • width: frame width in DIP. Defaults to a Chrome-defined value;

  • height: frame height in DIP. Defaults to a Chrome-defined value;

  • content: if true, also capture the response bodies. Defaults to false;

  • cache: if true allow caching. Defaults to false;

  • timeout: milliseconds to wait before giving up with a URL;

  • retry: number of retries on page load failure. Defaults to 0;

  • retryDelay: time to wait before starting a new attempt. Defaults to 0;

  • abortOnFailure: stop after the first failure (incompatible with parallel mode);

  • postData: maximum POST data size (in bytes) to be returned. Defaults to unlimited;

  • parallel: if true load the URLs in parallel (warning: this may spoil time-based metrics). Defaults to false;

  • preHook: function returning a Promise executed before each page load:

    • url: the current URL;
    • client: CDP client instance;
    • index: index of url in urls;
    • urls: input URL array.
  • postHook: function returning a Promise executed after each page load event:

    • url: the current URL;
    • client: CDP client instance;
    • index: index of url in urls;
    • urls: input URL array.

    If this hook resolves to a value, that value is included in the resulting HAR object under the _user key of this URL's page object.
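As a rough illustration of how these options combine, the sketch below builds an options object that uses both hooks. It assumes the hook arguments are passed positionally in the order listed above (url, client, index, urls) and that client exposes the usual chrome-remote-interface domain methods such as Runtime.evaluate. The makeOptions helper and the pageTitle key are hypothetical names chosen for this example.

```javascript
// Hypothetical helper: returns an options object suitable for run(urls, options).
function makeOptions() {
    return {
        host: 'localhost',   // same as the default
        port: 9222,          // same as the default
        content: true,       // also capture bodies
        retry: 2,            // retry a failed page load twice
        retryDelay: 1000,    // milliseconds, per the command line help
        // ASSUMPTION: hooks receive (url, client, index, urls) positionally.
        preHook: async (url, client, index, urls) => {
            console.error(`[${index + 1}/${urls.length}] about to load ${url}`);
        },
        postHook: async (url, client, index, urls) => {
            // Runtime.evaluate is a Chrome Debugging Protocol method.
            const {result} = await client.Runtime.evaluate({
                expression: 'document.title',
                returnByValue: true
            });
            // The resolved value ends up under the _user key of the page object.
            return {pageTitle: result.value};
        }
    };
}
```

With this sketch, each page object in the resulting HAR would be expected to carry something like _user: {pageTitle: '...'}.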

Event: 'load'
function (url, index, urls) {}

Emitted when Chrome is about to load url. index is the index of url in urls. urls is the array passed to run().

Event: 'done'
function (url, index, urls) {}

Emitted when Chrome finished loading url. index is the index of url in urls. urls is the array passed to run().

Event: 'fail'
function (url, err, index, urls) {}

Emitted when Chrome cannot load url. The Error object err contains the failure reason. Failed URLs will not appear in the resulting HAR object. index is the index of url in urls. urls is the array passed to run().

Event: 'har'
function (har) {}

Emitted when all the URLs have been processed. If all the URLs fail then a valid empty HAR object is returned. har is the resulting HAR object.
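The four events can be wired together as in the following minimal sketch, which wraps run() in a Promise that settles on 'har'. It assumes the returned emitter behaves like a standard Node.js EventEmitter (so that .on() calls can be chained). captureHar is a hypothetical helper, and the module object is passed in as chc rather than required at the top.

```javascript
// Hypothetical wrapper: resolves once the 'har' event fires.
// `chc` is the module object, e.g. require('chrome-har-capturer').
function captureHar(chc, urls, options = {}) {
    return new Promise((resolve) => {
        const failures = [];
        chc.run(urls, options)
            .on('load', (url, index, all) => {
                console.error(`load ${index + 1}/${all.length}: ${url}`);
            })
            .on('done', (url) => {
                console.error(`done ${url}`);
            })
            .on('fail', (url, err) => {
                // Failed URLs are absent from the HAR, so keep track of them here.
                failures.push({url, reason: err.message});
            })
            .on('har', (har) => {
                resolve({har, failures});
            });
    });
}

// Usage (needs the module installed and Chrome listening on the debugging port):
//   const chc = require('chrome-har-capturer');
//   captureHar(chc, ['https://example.com']).then(({har, failures}) => {
//       require('fs').writeFileSync('out.har', JSON.stringify(har, null, 4));
//   });
```

Collecting failures separately is a design choice of this sketch: the 'har' event alone does not say which URLs were dropped.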

fromLog(url, log, [options])

Generate a single-page HAR from an array of raw events that come from the Chrome Debugging Protocol (e.g., from chrome-remote-interface). Returns a Promise that resolves to the generated HAR.

url is the page URL;

log is the array of events in the form:

{
    method: '...',
    params: {...}
}

Events to be provided are:

  • Page.domContentEventFired;
  • Page.loadEventFired;
  • Network.requestWillBeSent;
  • Network.dataReceived;
  • Network.responseReceived;
  • Network.resourceChangedPriority;
  • Network.loadingFinished;
  • Network.loadingFailed.

Additional events for WebSockets are:

  • Network.webSocketWillSendHandshakeRequest;
  • Network.webSocketHandshakeResponseReceived;
  • Network.webSocketClosed;
  • Network.webSocketFrameSent;
  • Network.webSocketFrameReceived.
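One possible way to build such a log is sketched below. It assumes a chrome-remote-interface client, which emits a generic 'event' notification carrying method and params for every protocol event. WANTED_EVENTS and recordEvents are hypothetical names for this example, and the relevant domains (Page, Network) still have to be enabled on the client separately.

```javascript
// Event names taken from the two lists above.
const WANTED_EVENTS = new Set([
    'Page.domContentEventFired',
    'Page.loadEventFired',
    'Network.requestWillBeSent',
    'Network.dataReceived',
    'Network.responseReceived',
    'Network.resourceChangedPriority',
    'Network.loadingFinished',
    'Network.loadingFailed',
    'Network.webSocketWillSendHandshakeRequest',
    'Network.webSocketHandshakeResponseReceived',
    'Network.webSocketClosed',
    'Network.webSocketFrameSent',
    'Network.webSocketFrameReceived'
]);

// Hypothetical helper: append the relevant events to `log`
// in the {method, params} form expected by fromLog().
function recordEvents(client, log) {
    // ASSUMPTION: `client` emits 'event' with a {method, params} payload,
    // as a chrome-remote-interface client does.
    client.on('event', ({method, params}) => {
        if (WANTED_EVENTS.has(method)) {
            log.push({method, params});
        }
    });
    return log;
}
```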

options is an object with the following optional properties:

  • content: if true, also expect the response bodies. Defaults to false.

When content is true synthetic events in the following form are also expected:

{
    method: 'Network.getResponseBody',
    params: {
        requestId: '...',
        body: '...',
        base64Encoded: true/false
    }
}

These events contain the reply of the Network.getResponseBody method. This is needed because Chrome does not return the body content via events; instead, it must be requested manually and the reply must be appended to the other events in the log.
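A minimal sketch of that manual step follows. It assumes a chrome-remote-interface style client whose Network.getResponseBody call resolves to {body, base64Encoded}, and it requests a body for every request that reached Network.loadingFinished. appendBody and harWithContent are hypothetical helpers, the module object is passed in as chc, and error handling for bodies that Chrome can no longer provide is left out.

```javascript
// Hypothetical helper: fetch one response body and append it to `log`
// as the synthetic event described above.
async function appendBody(client, log, requestId) {
    const {body, base64Encoded} = await client.Network.getResponseBody({requestId});
    log.push({
        method: 'Network.getResponseBody',
        params: {requestId, body, base64Encoded}
    });
}

// Hypothetical helper: add a body for every finished request, then build the HAR.
// `chc` is the module object, e.g. require('chrome-har-capturer').
async function harWithContent(chc, client, url, log) {
    // Collect the IDs first so that appending to `log` does not affect the loop.
    const finished = log
        .filter(({method}) => method === 'Network.loadingFinished')
        .map(({params}) => params.requestId);
    for (const requestId of finished) {
        await appendBody(client, log, requestId);
    }
    return chc.fromLog(url, log, {content: true});
}
```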
