Language: TypeScript
License: Apache License 2.0

Puppeteer Core fork that works with Cloudflare Browser Workers

Workers version of Puppeteer Core

This repo is a fork of the main Puppeteer project. It creates a version of Puppeteer Core specialized for use in Cloudflare Workers.

The goals of the fork are:

  • Support as much of the existing puppeteer core lib as possible.
  • Minimize the size of the library for workers developers, since library space is at a premium in workers projects.
  • Make library use as seamless as possible in workers.

Note that the main branch in this repo is branched off of version 17.0.0 of the library, to match the currently deployed version of Chromium on the edge.

Original README follows...

Puppeteer

API | FAQ | Contributing | Troubleshooting

Puppeteer is a Node library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default, but can be configured to run full (non-headless) Chrome or Chromium.

What can I do?

Most things that you can do manually in the browser can be done using Puppeteer! Here are a few examples to get you started:

  • Generate screenshots and PDFs of pages.
  • Crawl a SPA (Single-Page Application) and generate pre-rendered content (i.e. "SSR" (Server-Side Rendering)).
  • Automate form submission, UI testing, keyboard input, etc.
  • Create an up-to-date, automated testing environment. Run your tests directly in the latest version of Chrome using the latest JavaScript and browser features.
  • Capture a timeline trace of your site to help diagnose performance issues.
  • Test Chrome Extensions.

Getting Started

Installation

To use Puppeteer in your project, run:

npm i puppeteer
# or "yarn add puppeteer"

When you install Puppeteer, it downloads a recent version of Chromium (~170MB Mac, ~282MB Linux, ~280MB Win) that is guaranteed to work with the API (customizable through Environment Variables). For a version of Puppeteer purely for connection, see puppeteer-core.

Environment Variables

Puppeteer looks for certain environment variables to aid its operations. If Puppeteer doesn't find them in the environment during the installation step, a lowercased variant of these variables will be used from the npm config.

  • HTTP_PROXY, HTTPS_PROXY, NO_PROXY - define the HTTP proxy settings that are used to download and run the browser.
  • PUPPETEER_SKIP_CHROMIUM_DOWNLOAD - do not download the bundled Chromium during the installation step.
  • PUPPETEER_TMP_DIR - defines the directory to be used by Puppeteer for creating temporary files. Defaults to os.tmpdir().
  • PUPPETEER_DOWNLOAD_HOST - overwrite URL prefix that is used to download Chromium. Note: this includes protocol and might even include path prefix. Defaults to https://storage.googleapis.com.
  • PUPPETEER_DOWNLOAD_PATH - overwrite the path for the downloads folder. Defaults to <root>/.local-chromium, where <root> is Puppeteer's package root.
  • PUPPETEER_CHROMIUM_REVISION - specify a certain version of Chromium you'd like Puppeteer to use. See puppeteer.launch on how executable path is inferred.
  • PUPPETEER_EXECUTABLE_PATH - specify an executable path to be used in puppeteer.launch.
  • PUPPETEER_PRODUCT - specify which browser you'd like Puppeteer to use. Must be one of chrome or firefox. This can also be used during installation to fetch the recommended browser binary. Setting product programmatically in puppeteer.launch supersedes this environment variable. The product is exposed in puppeteer.product.
  • PUPPETEER_EXPERIMENTAL_CHROMIUM_MAC_ARM - tell Puppeteer to download Chromium for Apple M1. On Apple M1 devices, Puppeteer by default downloads the version for Intel processors, which runs via Rosetta. That works without problems, but with this option you should get more efficient resource usage (CPU and RAM), which could lead to faster execution.
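
The fallback from an environment variable to its lowercased npm config variant can be sketched as below. This is a minimal illustration, not Puppeteer's installer code: the helper name resolveSetting and the sample values are hypothetical, and it assumes npm exposes config values to install scripts as npm_config_* environment variables.

```javascript
// Hypothetical helper: look up a setting in the environment first, then
// fall back to the lowercased npm config variant (npm_config_<name>).
function resolveSetting(name, env = process.env) {
  const direct = env[name];
  if (direct !== undefined && direct !== '') {
    return direct;
  }
  return env[`npm_config_${name.toLowerCase()}`];
}

// Example with an explicit env object (values are made up for illustration).
const sampleEnv = {
  PUPPETEER_PRODUCT: 'firefox',
  npm_config_puppeteer_download_host: 'https://mirror.example.com',
};

console.log(resolveSetting('PUPPETEER_PRODUCT', sampleEnv));
console.log(resolveSetting('PUPPETEER_DOWNLOAD_HOST', sampleEnv));
console.log(resolveSetting('PUPPETEER_TMP_DIR', sampleEnv));
```

The first lookup is served by the environment variable itself, the second by the npm config variant, and the third finds nothing, so the caller would apply the documented default.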

:::danger

Puppeteer is only guaranteed to work with the bundled Chromium; use any other browser at your own risk.

:::

:::caution

PUPPETEER_* env variables are not accounted for in puppeteer-core.

:::

puppeteer-core

Since v1.7.0, every release publishes two packages: puppeteer and puppeteer-core.

puppeteer is a product for browser automation. When installed, it downloads a version of Chromium, which it then drives using puppeteer-core. Being an end-user product, puppeteer supports a bunch of convenient PUPPETEER_* env variables to tweak its behavior.

puppeteer-core is a library to help drive anything that supports DevTools protocol. puppeteer-core doesn't download Chromium when installed. Being a library, puppeteer-core is fully driven through its programmatic interface and disregards all the PUPPETEER_* env variables.

To sum up, the only differences between puppeteer-core and puppeteer are:

  • puppeteer-core doesn't automatically download Chromium when installed.
  • puppeteer-core ignores all PUPPETEER_* env variables.

In most cases, you'll be fine using the puppeteer package.

However, you should use puppeteer-core if:

  • you're building another end-user product or library on top of the DevTools protocol. For example, one might build a PDF generator using puppeteer-core and write a custom install.js script that downloads headless_shell instead of Chromium to save disk space.
  • you're bundling Puppeteer to use in a Chrome Extension or a browser with the DevTools protocol, where downloading an additional Chromium binary is unnecessary.
  • you're building a set of tools where puppeteer-core is one of the ingredients and you want to postpone install.js script execution until Chromium is about to be used.

When using puppeteer-core, remember to change the include line:

const puppeteer = require('puppeteer-core');

You will then need to call puppeteer.connect or puppeteer.launch with an explicit executablePath or channel option.
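
As a rough sketch of that choice, the helper below picks between connect and launch based on which option is supplied. The helper name openBrowser is hypothetical; the puppeteer argument is assumed to be the object returned by require('puppeteer-core'), and browserWSEndpoint, executablePath, and channel are the existing connect and launch options.

```javascript
// Hypothetical helper: `puppeteer` is whatever require('puppeteer-core') returns.
function openBrowser(puppeteer, {browserWSEndpoint, executablePath, channel} = {}) {
  if (browserWSEndpoint) {
    // Attach to a browser that is already running.
    return puppeteer.connect({browserWSEndpoint});
  }
  if (executablePath || channel) {
    // Start a locally installed browser; puppeteer-core downloads nothing.
    return puppeteer.launch(executablePath ? {executablePath} : {channel});
  }
  throw new Error(
    'puppeteer-core needs browserWSEndpoint, executablePath, or channel'
  );
}

// Usage (assumes puppeteer-core is installed and the path exists):
// const puppeteer = require('puppeteer-core');
// const browser = await openBrowser(puppeteer, {executablePath: '/path/to/Chrome'});
```

Failing early when no option is given makes the missing configuration explicit instead of leaving it to a launch error.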

Usage

Puppeteer follows the latest maintenance LTS version of Node.

Puppeteer will be familiar to people using other browser testing frameworks. You create an instance of Browser, open pages, and then manipulate them with Puppeteer's API.

Example - navigating to https://example.com and saving a screenshot as example.png:

Save file as example.js

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  await page.screenshot({path: 'example.png'});

  await browser.close();
})();

Execute script on the command line

node example.js

Puppeteer sets an initial page size to 800×600px, which defines the screenshot size. The page size can be customized with Page.setViewport().
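
A minimal sketch of customizing the page size before a screenshot follows. The helper name screenshotAtSize and the file name are hypothetical; page is assumed to be a Puppeteer Page, for example one returned by browser.newPage().

```javascript
// Hypothetical helper: `page` is a Puppeteer Page, e.g. from browser.newPage().
async function screenshotAtSize(page, width, height, path) {
  // Set the viewport first; it defines the size of the captured screenshot.
  await page.setViewport({width, height});
  await page.screenshot({path});
}

// Usage inside the example above, after page.goto(...):
// await screenshotAtSize(page, 1280, 720, 'example-wide.png');
```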

Example - create a PDF.

Save file as hn.js

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://news.ycombinator.com', {
    waitUntil: 'networkidle2',
  });
  await page.pdf({path: 'hn.pdf', format: 'a4'});

  await browser.close();
})();

Execute script on the command line

node hn.js

See Page.pdf for more information about creating PDFs.

Example - evaluate script in the context of the page

Save file as get-dimensions.js

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');

  // Get the "viewport" of the page, as reported by the page.
  const dimensions = await page.evaluate(() => {
    return {
      width: document.documentElement.clientWidth,
      height: document.documentElement.clientHeight,
      deviceScaleFactor: window.devicePixelRatio,
    };
  });

  console.log('Dimensions:', dimensions);

  await browser.close();
})();

Execute script on the command line

node get-dimensions.js

See Page.evaluate and related methods like Page.evaluateOnNewDocument and Page.exposeFunction.

Running in Docker

Puppeteer offers a Docker image that includes Chromium along with the required dependencies and a pre-installed Puppeteer version. The image is available via the GitHub Container Registry. The latest image is tagged as latest and other tags match Puppeteer versions. For example,

docker pull ghcr.io/puppeteer/puppeteer:latest # pulls the latest
docker pull ghcr.io/puppeteer/puppeteer:16.1.0 # pulls the image that contains Puppeteer v16.1.0

The image is meant for running the browser in sandbox mode; therefore, running the image requires the SYS_ADMIN capability. For example,

docker run -i --init --cap-add=SYS_ADMIN --rm ghcr.io/puppeteer/puppeteer:latest node -e "`cat docker/test/smoke-test.js`"

Replace the path to smoke-test.js with a path to your script. The script can import or require the puppeteer module because it's pre-installed inside the image.

Currently, the image includes the LTS version of Node.js. If you need to build an image based on a different base image, you can use our Dockerfile as the starting point.

Working with Chrome Extensions

Puppeteer can be used for testing Chrome Extensions.

:::caution

Extensions in Chrome / Chromium currently only work in non-headless mode and experimental Chrome headless mode.

:::

The following is code for getting a handle to the background page of an extension whose source is located in ./my-extension:

const puppeteer = require('puppeteer');

(async () => {
  const pathToExtension = require('path').join(__dirname, 'my-extension');
  const browser = await puppeteer.launch({
    headless: 'chrome',
    args: [
      `--disable-extensions-except=${pathToExtension}`,
      `--load-extension=${pathToExtension}`,
    ],
  });
  const backgroundPageTarget = await browser.waitForTarget(
    target => target.type() === 'background_page'
  );
  const backgroundPage = await backgroundPageTarget.page();
  // Test the background page as you would any other page.
  await browser.close();
})();

:::note

Chrome Manifest V3 extensions have a background ServiceWorker of type 'service_worker', instead of a page of type 'background_page'.

:::
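
For a Manifest V3 extension, the target lookup might be adapted along these lines. This is a sketch under assumptions: the helper name getExtensionWorker is hypothetical, and it assumes the installed Puppeteer version provides Target.worker() for worker-type targets.

```javascript
// Hypothetical helper: waits for an extension's Manifest V3 service worker.
// `browser` is a Puppeteer Browser launched with the extension loaded.
async function getExtensionWorker(browser) {
  const workerTarget = await browser.waitForTarget(
    target => target.type() === 'service_worker'
  );
  // Assumption: Target.worker() resolves to the worker for this target type.
  return workerTarget.worker();
}

// Usage, replacing the background page lookup in the example above:
// const worker = await getExtensionWorker(browser);
```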

:::note

It is not yet possible to test extension popups or content scripts.

:::

Default runtime settings

1. Uses Headless mode

Puppeteer launches Chromium in headless mode. To launch a full version of Chromium, set the headless option when launching a browser:

const browser = await puppeteer.launch({headless: false}); // default is true

2. Runs a bundled version of Chromium

By default, Puppeteer downloads and uses a specific version of Chromium so its API is guaranteed to work out of the box. To use Puppeteer with a different version of Chrome or Chromium, pass in the executable's path when creating a Browser instance:

const browser = await puppeteer.launch({executablePath: '/path/to/Chrome'});

You can also use Puppeteer with Firefox Nightly (experimental support). See Puppeteer.launch for more information.

See this article for a description of the differences between Chromium and Chrome. This article describes some differences for Linux users.

3. Creates a fresh user profile

Puppeteer creates its own browser user profile which it cleans up on every run.

Resources

Debugging tips

  1. Turn off headless mode - sometimes it's useful to see what the browser is displaying. Instead of launching in headless mode, launch a full version of the browser using headless: false:

    const browser = await puppeteer.launch({headless: false});
  2. Slow it down - the slowMo option slows down Puppeteer operations by the specified number of milliseconds. It's another way to help see what's going on.

    const browser = await puppeteer.launch({
      headless: false,
      slowMo: 250, // slow down by 250ms
    });
  3. Capture console output - You can listen for the console event. This is also handy when debugging code in page.evaluate():

    page.on('console', msg => console.log('PAGE LOG:', msg.text()));
    
    await page.evaluate(() => console.log(`url is ${location.href}`));
  4. Use debugger in application code browser

    There are two execution contexts: Node.js, which runs the test code, and the browser, which runs the application code being tested. This lets you debug code in the application code browser, i.e. code inside evaluate().

    • Use {devtools: true} when launching Puppeteer:

      const browser = await puppeteer.launch({devtools: true});
    • Change default test timeout:

      jest: jest.setTimeout(100000);

      jasmine: jasmine.DEFAULT_TIMEOUT_INTERVAL = 100000;

      mocha: this.timeout(100000); (don't forget to change test to use function and not '=>')

    • Add an evaluate statement with debugger inside / add debugger to an existing evaluate statement:

      await page.evaluate(() => {
        debugger;
      });

      The test will now stop executing in the above evaluate statement, and chromium will stop in debug mode.

  5. Use debugger in node.js

    This will let you debug test code. For example, you can step over await page.click() in the node.js script and see the click happen in the application code browser.

    Note that you won't be able to run await page.click() in DevTools console due to this Chromium bug. So if you want to try something out, you have to add it to your test file.

    • Add debugger; to your test, eg:

      debugger;
      await page.click('a[target=_blank]');
    • Set headless to false

    • Run node --inspect-brk, eg node --inspect-brk node_modules/.bin/jest tests

    • In Chrome open chrome://inspect/#devices and click inspect

    • In the newly opened test browser, type F8 to resume test execution

    • Now your debugger will be hit and you can debug in the test browser

  6. Enable verbose logging - internal DevTools protocol traffic will be logged via the debug module under the puppeteer namespace.

     # Basic verbose logging
     env DEBUG="puppeteer:*" node script.js
    
     # Protocol traffic can be rather noisy. This example filters out all Network domain messages
     env DEBUG="puppeteer:*" env DEBUG_COLORS=true node script.js 2>&1 | grep -v '"Network'
    
  7. Debug your Puppeteer (node) code easily, using ndb

    • npm install -g ndb (or even better, use npx!)

    • add a debugger to your Puppeteer (node) code

    • add ndb (or npx ndb) before your test command. For example:

      ndb jest or ndb mocha (or npx ndb jest / npx ndb mocha)

    • debug your test inside chromium like a boss!

Contributing

Check out our contributing guide to get an overview of Puppeteer development.

FAQ

Our FAQ has migrated to our site.
