  • Stars: 25
  • Rank 952,138 (Top 19 %)
  • Language: Python
  • License: BSD 3-Clause "New...
  • Created about 8 years ago
  • Updated about 8 years ago

Repository Details

Scrapy spider middleware to clean up query parameters in request URLs
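
As an illustration of what cleaning up query parameters can mean in practice, here is a minimal standard-library sketch. It is not this project's code or API: the function name `clean_query` and its `remove`/`keep` regex arguments are hypothetical, chosen only to show the general idea of filtering a URL's query string by parameter name.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def clean_query(url, remove=None, keep=None):
    """Return `url` with its query parameters filtered by name.

    `remove` and `keep` are hypothetical regex patterns (assumptions for
    this sketch): parameters whose names match `keep` are retained, and
    parameters whose names match `remove` are dropped.
    """
    parts = urlsplit(url)
    # keep_blank_values=True so that "?a=&b=1" does not silently lose "a"
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    if keep is not None:
        pairs = [(k, v) for k, v in pairs if re.search(keep, k)]
    if remove is not None:
        pairs = [(k, v) for k, v in pairs if not re.search(remove, k)]
    # Rebuild the URL with only the surviving parameters
    return urlunsplit(parts._replace(query=urlencode(pairs)))


cleaned = clean_query(
    "https://example.com/p?id=1&utm_source=x&utm_medium=y",
    remove=r"^utm_",
)
print(cleaned)
```

In a real Scrapy project, logic like this would typically sit inside a spider middleware that rewrites outgoing request URLs; the sketch leaves that wiring out and shows only the URL transformation.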

More Repositories

1. scrapy-splash (Python, 3,115 stars): Scrapy+Splash for JavaScript integration
2. scrapy-playwright (Python, 944 stars): 🎭 Playwright integration for Scrapy
3. scrapy-djangoitem (Python, 499 stars): Scrapy extension to write scraped items using Django models
4. scrapy-zyte-smartproxy (Python, 354 stars): Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy
5. scrapy-jsonrpc (Python, 296 stars): Scrapy extension to control spiders using JSON-RPC
6. scrapy-deltafetch (Python, 264 stars): Scrapy spider middleware to ignore requests to pages containing items seen in previous crawls
7. scrapy-magicfields (Python, 55 stars): Scrapy middleware to add extra fields to items, like timestamp, response fields, spider attributes etc.
8. scrapy-jsonschema (Python, 44 stars): Scrapy schema validation pipeline and Item builder using JSON Schema
9. scrapy-monkeylearn (Python, 37 stars): A Scrapy pipeline to categorize items using MonkeyLearn
10. scrapy-zyte-api (Python, 33 stars): Zyte API integration for Scrapy
11. scrapy-headless (Python, 29 stars)
12. scrapy-pagestorage (Python, 26 stars): A scrapy extension to store requests and responses information in storage service
13. scrapy-splitvariants (Python, 20 stars): Scrapy spider middleware to split an item into multiple items using a multi-valued key
14. scrapy-streaming (Python, 17 stars)
15. scrapy-dotpersistence (Python, 16 stars): A scrapy extension to sync `.scrapy` folder to an S3 bucket
16. scrapy-streamitem (Python, 11 stars): Scrapy support for working with streamcorpus Stream Items.
17. scrapy-crawlera-fetch (Python, 8 stars): Scrapy Downloader Middleware for Crawlera Fetch API
18. scrapy-feedexporter-sftp (Python, 6 stars)
19. scrapy-statsd (Python, 6 stars)
20. scrapy-bigml (Python, 4 stars): Scrapy pipeline for writing items to BigML datasets
21. scrapy-spider-metadata (Python, 4 stars)
22. scrapy-hcf (Python, 4 stars): Scrapy spider middleware to use Scrapinghub's Hub Crawl Frontier as a backend for URLs
23. scrapy-snowflake-stage-exporter (Python, 4 stars): Snowflake database loading utility with Scrapy integration
24. scrapy-feedexporter-google-drive (Python, 3 stars)
25. scrapy-feedexporter-azure-storage (Python, 2 stars)
26. scrapy-feedexporter-onedrive (Python, 1 star): Export to OneDrive
27. scrapy-incremental (Python, 1 star)
28. scrapy-feedexporter-dropbox (Python, 1 star): Scrapy feed exporter for Dropbox
29. scrapy-feedexporter-google-sheets (Python, 1 star)