• Stars
    3,111
  • Rank 14,450 (Top 0.3 %)
  • Language
    Python
  • License
    GNU General Publi...
  • Created about 6 years ago
  • Updated 3 months ago

Repository Details

Web app for Scrapyd cluster management, Scrapy log analysis & visualization, Auto packaging, Timer tasks, Monitor & Alert, and Mobile UI. DEMO 👉

🔤 English | 🀄 简体中文 (Simplified Chinese)

ScrapydWeb: Web app for Scrapyd cluster management, with support for Scrapy log analysis & visualization.

Scrapyd ❌ ScrapydWeb ❌ LogParser

📖 Recommended Reading

🔗 How to efficiently manage your distributed web scraping projects

🔗 How to set up Scrapyd cluster on Heroku

👀 Demo

🔗 scrapydweb.herokuapp.com

⭐ Features

  • 💠 Scrapyd Cluster Management

    • 💯 All Scrapyd JSON API endpoints supported
    • ☑️ Group, filter and select any number of nodes
    • 🖱️ Execute commands on multiple nodes with just a few clicks
  • 🔍 Scrapy Log Analysis

    • 📊 Stats collection
    • 📈 Progress visualization
    • 📑 Logs categorization
  • 🔋 Enhancements

    • 📦 Auto packaging
    • 🕵️‍♂️ Integrated with 🔗 LogParser
    • ⏰ Timer tasks
    • 📧 Monitor & Alert
    • 📱 Mobile UI
    • 🔐 Basic auth for web UI
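
To make the cluster-management idea concrete, here is a minimal sketch of what "executing a command on multiple nodes" amounts to at the HTTP level. It is not ScrapydWeb's implementation. It only builds request URLs for two of Scrapyd's documented JSON endpoints (daemonstatus.json and listprojects.json). The node addresses and the helper names api_url and fan_out are hypothetical.

```python
from urllib.parse import urlunsplit

# Hypothetical node list; replace with the host:port pairs of your own Scrapyd servers.
NODES = ["127.0.0.1:6800", "192.168.0.2:6800"]


def api_url(node: str, endpoint: str) -> str:
    """Build the URL of one Scrapyd JSON API endpoint on one node."""
    return urlunsplit(("http", node, f"/{endpoint}", "", ""))


def fan_out(nodes, endpoint):
    """Return one URL per node, i.e. the requests a multi-node command would issue."""
    return [api_url(node, endpoint) for node in nodes]


if __name__ == "__main__":
    for url in fan_out(NODES, "daemonstatus.json"):
        print(url)
```

Each URL can then be fetched with any HTTP client. ScrapydWeb adds grouping, filtering and a UI on top of this kind of fan-out.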

💻 Getting Started

⚠️ Prerequisites

❗ Make sure that 🔗 Scrapyd has been installed and started on all of your hosts.

‼️ Note that for remote access, you have to manually set 'bind_address = 0.0.0.0' in 🔗 the configuration file of Scrapyd and restart Scrapyd to make it visible externally.
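
As an illustration of the bind_address change, the sketch below rewrites the setting in the text of a Scrapyd configuration file using Python's standard configparser. It assumes the setting lives in the [scrapyd] section, as in Scrapyd's documented config format. The function name set_bind_address and the sample config are made up for this example, and configparser drops comments when it writes the file back, so editing by hand is just as valid.

```python
import configparser
import io


def set_bind_address(conf_text: str, address: str = "0.0.0.0") -> str:
    """Return conf_text with bind_address set in the [scrapyd] section."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    if not parser.has_section("scrapyd"):
        parser.add_section("scrapyd")
    parser.set("scrapyd", "bind_address", address)
    out = io.StringIO()
    parser.write(out)  # note: comments in the original text are not preserved
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical sample config; your real file may contain more options.
    sample = "[scrapyd]\nbind_address = 127.0.0.1\nhttp_port = 6800\n"
    print(set_bind_address(sample))
```

Restart Scrapyd after changing the file so that the new address takes effect.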

⬇️ Install

  • Use pip:
pip install scrapydweb

❗ You may need to run python -m pip install --upgrade pip first in order to get the latest version of scrapydweb. Alternatively, download the tar.gz file from https://pypi.org/project/scrapydweb/#files and install it via pip install scrapydweb-x.x.x.tar.gz

  • Use git:
pip install --upgrade git+https://github.com/my8100/scrapydweb.git

Or:

git clone https://github.com/my8100/scrapydweb.git
cd scrapydweb
python setup.py install
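
Whichever install route you pick, it can be useful to confirm which version actually ended up in the current environment. The following is a small sketch using the standard library's importlib.metadata (Python 3.8+). The helper name installed_version is hypothetical and not part of ScrapydWeb.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "scrapydweb") -> Optional[str]:
    """Return the installed version of package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else "scrapydweb is not installed in this environment")
```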

▶️ Start

  1. Start ScrapydWeb with the command scrapydweb. (A config file is generated on first startup so that you can customize settings.)
  2. Visit http://127.0.0.1:5000 (Google Chrome is recommended for a better experience.)

🌐 Browser Support

The latest versions of Google Chrome, Firefox, and Safari.

✔️ Running the tests

$ git clone https://github.com/my8100/scrapydweb.git
$ cd scrapydweb

# To create isolated Python environments
$ pip install virtualenv
$ virtualenv venv/scrapydweb
# Or specify your Python interpreter: $ virtualenv -p /usr/local/bin/python3.7 venv/scrapydweb
$ source venv/scrapydweb/bin/activate

# Install dependent libraries
(scrapydweb) $ python setup.py install
(scrapydweb) $ pip install pytest
(scrapydweb) $ pip install coverage

# Make sure Scrapyd has been installed and started, then update the custom_settings item in tests/conftest.py
(scrapydweb) $ vi tests/conftest.py
(scrapydweb) $ curl http://127.0.0.1:6800

# '-x': stop on first failure
(scrapydweb) $ coverage run --source=scrapydweb -m pytest tests/test_a_factory.py -s -vv -x
(scrapydweb) $ coverage run --source=scrapydweb -m pytest tests -s -vv --disable-warnings
(scrapydweb) $ coverage report
# Create an HTML report, then open htmlcov/index.html
(scrapydweb) $ coverage html
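
The curl step above only checks that Scrapyd answers before the test suite runs. If you prefer to do that check from Python, a rough equivalent using only the standard library could look like the sketch below. The function name is_reachable is made up for this example, and the default URL simply mirrors the address used in the curl command.

```python
import urllib.error
import urllib.request


def is_reachable(url: str = "http://127.0.0.1:6800", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at url, whatever the status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status (for example 401).
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout and similar errors.
        return False


if __name__ == "__main__":
    print("Scrapyd reachable:", is_reachable())
```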

🏗️ Built With

📋 Changelog

Detailed changes for each release are documented in 🔗 HISTORY.md.

👨‍💻 Author

my8100

👥 Contributors

Kaisla

©️ License

This project is licensed under the GNU General Public License v3.0. See the 🔗 LICENSE file for details.