  • Stars: 163
  • Rank: 231,141 (Top 5%)
  • Language: TypeScript
  • License: MIT License
  • Created about 6 years ago
  • Updated about 4 years ago


Repository Details

📃 A tool that archives web pages exactly as they appear

📃 Vanilla Clipper

Japanese (Qiita)

Vanilla Clipper is a Node.js library that uses Puppeteer to save a complete webpage locally. A single command saves all of the page's contents, including images, videos, CSS, web fonts, iframes, and Shadow DOM trees.

Dependencies

  • Node.js (>= 8.10)
  • Chrome or Chromium (Latest version)

Installation

yarn global add vanilla-clipper
# or
npm i -g vanilla-clipper

Usage

CLI

Note: If it fails to launch, try adding the --no-sandbox (-n) option.

  • Save https://example.com:

    vanilla-clipper https://example.com
  • Save the .timeline element of https://example.com to the tech directory, with the browser language set to Japanese:

    vanilla-clipper -d tech -s .timeline -l ja-JP https://example.com
  • Log in with the sub account defined in the config file:

    vanilla-clipper -a sub https://example.com

See here for details of the options.
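
When the CLI is driven from a script, the flag combinations above can be assembled programmatically. The following is a minimal sketch: the `ClipOptions` interface and the `buildClipArgs` helper are hypothetical names introduced for illustration, and only the short flags (-d, -s, -l, -a, -n) come from the examples above.

```typescript
// Hypothetical options object; it is not part of Vanilla Clipper itself.
// Each field maps to one short flag shown in the CLI examples.
interface ClipOptions {
    directory?: string // -d: target directory
    selector?: string // -s: element to save
    language?: string // -l: browser language
    account?: string // -a: account label from the config file
    noSandbox?: boolean // -n: same as --no-sandbox
}

/** Build the argument list for one `vanilla-clipper` invocation. */
function buildClipArgs(url: string, options: ClipOptions = {}): string[] {
    const args: string[] = []
    if (options.directory) args.push('-d', options.directory)
    if (options.selector) args.push('-s', options.selector)
    if (options.language) args.push('-l', options.language)
    if (options.account) args.push('-a', options.account)
    if (options.noSandbox) args.push('-n')
    args.push(url) // the URL goes last, as in the examples
    return args
}

const args = buildClipArgs('https://example.com', {
    directory: 'tech',
    selector: '.timeline',
    language: 'ja-JP',
})
console.log(['vanilla-clipper', ...args].join(' '))
// vanilla-clipper -d tech -s .timeline -l ja-JP https://example.com
```

The resulting array can be passed to a process spawner of your choice. Keeping the arguments as an array rather than a single string avoids shell quoting problems with selectors.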

📂 Directory structure in ~/.vanilla-clipper

📂 .vanilla-clipper
   📂 pages
      📂 main
         📃 20190213-page1.html
         ︙
      📂 {SOME_FOLDER}
         📃 20190213-page2.html
         📃 20190214-page3.html
         ︙

   📂 resources
      📂 20190213
         📎 {ulid}.jpg
         📎 {ulid}.svg
         ︙
      📂 20190214
         📎 {ulid}.woff2
         ︙

   💎 resources.json
   💎 config.json
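
The naming pattern in the tree above (a YYYYMMDD date prefix for pages, a date folder plus ULID file name for resources) can be sketched as follows. The helper names are hypothetical and are not Vanilla Clipper's own code. The sketch also assumes that `main` is the folder used when no directory is given and that dates are formatted in UTC; the tool itself may use local time.

```typescript
// Hypothetical helpers that reproduce the layout shown in the tree.
// All paths are relative to ~/.vanilla-clipper.

/** Format a date as YYYYMMDD (UTC is an assumption made for determinism). */
function dateStamp(date: Date): string {
    const year = String(date.getUTCFullYear())
    const month = ('0' + (date.getUTCMonth() + 1)).slice(-2)
    const day = ('0' + date.getUTCDate()).slice(-2)
    return year + month + day
}

/** Path of a saved page; `main` as the default folder is an assumption. */
function pagePath(date: Date, name: string, folder: string = 'main'): string {
    return `pages/${folder}/${dateStamp(date)}-${name}.html`
}

/** Path of a saved resource, grouped by the day it was stored. */
function resourcePath(date: Date, id: string, extension: string): string {
    return `resources/${dateStamp(date)}/${id}.${extension}`
}

const savedOn = new Date(Date.UTC(2019, 1, 13)) // 13 February 2019
console.log(pagePath(savedOn, 'page1')) // pages/main/20190213-page1.html
console.log(pagePath(savedOn, 'page2', 'tech')) // pages/tech/20190213-page2.html
console.log(resourcePath(savedOn, '{ulid}', 'jpg')) // resources/20190213/{ulid}.jpg
```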

โš™๏ธ Config file example

{YOUR_HOME_DIRECTORY}/.vanilla-clipper/config.js

module.exports = {
    resource: { maxSize: 50 * 1024 * 1024 },
    sites: [
        {
            url: 'example.com', // site URL
            accounts: {
                default: {
                    // ↑ account label
                    username: 'main', // or () => 'main'
                    password: 'password1',
                },
                sub: {
                    // ↑ account label
                    username: 'sub_account',
                    password: 'password2',
                },
            },
            login: [
                // [action, arg1, arg2, ...]
                [
                    'goto',
                    'https://example.com/login', // URL
                ],
                [
                    'input',
                    'input[name="session[username_or_email]"]', // selector
                    '$username', // -> accounts.{ACCOUNT_LABEL}.username
                ],
                [
                    'input',
                    'input[name="session[password]"]', // selector
                    '$password', // -> accounts.{ACCOUNT_LABEL}.password
                ],
                [
                    'submit',
                    '[role=button]', // selector
                ],
            ],
        },
    ],
}
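
To make the `$username` and `$password` placeholders concrete, here is a minimal sketch of how login steps could be resolved against an account label. It is an illustration of the config semantics described in the comments above, not Vanilla Clipper's actual implementation. The `Credential`, `Account`, and `LoginStep` types and the `resolveLoginSteps` helper are hypothetical names.

```typescript
// Hypothetical types mirroring the shape of the config example.
type Credential = string | (() => string) // 'main' or () => 'main'
interface Account {
    username: Credential
    password: Credential
}
type LoginStep = string[] // [action, arg1, arg2, ...]

/** Read a credential that is either a plain string or a function returning one. */
function readCredential(value: Credential): string {
    return typeof value === 'function' ? value() : value
}

/** Replace `$username` / `$password` arguments with the chosen account's values. */
function resolveLoginSteps(
    steps: LoginStep[],
    accounts: Record<string, Account>,
    label: string = 'default'
): LoginStep[] {
    const account = accounts[label]
    if (account === undefined) throw new Error(`Unknown account label: ${label}`)
    return steps.map(step =>
        step.map(arg => {
            if (arg === '$username') return readCredential(account.username)
            if (arg === '$password') return readCredential(account.password)
            return arg // actions, URLs, and selectors pass through unchanged
        })
    )
}

const accounts: Record<string, Account> = {
    default: { username: () => 'main', password: 'password1' },
    sub: { username: 'sub_account', password: 'password2' },
}
const login: LoginStep[] = [
    ['goto', 'https://example.com/login'],
    ['input', 'input[name="session[username_or_email]"]', '$username'],
    ['input', 'input[name="session[password]"]', '$password'],
    ['submit', '[role=button]'],
]

console.log(resolveLoginSteps(login, accounts, 'sub')[1][2]) // sub_account
console.log(resolveLoginSteps(login, accounts)[1][2]) // main
```

Selecting the label `sub` here corresponds to passing -a sub on the command line. Treating `default` as the label used when none is given is an assumption based on the config example.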