Apify (@apify)

Top repositories

1

crawlee

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
TypeScript
12,707
star
2

proxy-chain

Node.js implementation of a proxy server (think Squid) with support for SSL, authentication and upstream proxy chaining.
JavaScript
798
star
3

fingerprint-suite

Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.
TypeScript
760
star
4

got-scraping

HTTP client made for scraping based on got.
TypeScript
417
star
5

actor-page-analyzer

Apify actor that opens a web page in headless Chrome and analyzes the HTML and JavaScript objects, looks for schema.org microdata and JSON-LD metadata, analyzes AJAX requests, etc.
JavaScript
149
star
6

apify-cli

The Apify command-line interface helps you create, develop, build, and run Apify actors, and manage the Apify cloud platform.
TypeScript
115
star
7

actor-scraper

House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.
JavaScript
114
star
8

apify-sdk-python

The Apify SDK for Python is the official library for creating Apify Actors in Python. It provides useful features like actor lifecycle management, local storage emulation, and actor event handling.
Python
110
star
9

apify-sdk-js

Apify SDK monorepo
TypeScript
108
star
10

browser-pool

A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
TypeScript
87
star
11

apify-actor-docker

Base Docker images for Apify actors.
Dockerfile
64
star
12

fingerprint-generator

Generates realistic browser fingerprints
TypeScript
63
star
13

fingerprint-injector

Home of fingerprint injector.
TypeScript
62
star
14

header-generator

Node.js package for generating browser-like headers.
TypeScript
62
star
15

apify-client-js

Apify API client for JavaScript / Node.js.
TypeScript
61
star
16

covid-19

Open APIs with statistics about Covid-19
JavaScript
45
star
17

apify-client-python

Apify API client for Python
Python
42
star
18

apify-docs

This project is the home of Apify's documentation.
API Blueprint
22
star
19

xlsx-stream

JavaScript / Node.js library to stream data into an XLSX file
JavaScript
22
star
20

apify-ts

Crawlee dev repo
TypeScript
21
star
21

got-cjs

An action to release a CommonJS version of the popular library got, which is soon to be available only in an ESM format.
JavaScript
21
star
22

actor-templates

This project is the 🏠 home of Apify actor template projects to help users quickly get started.
Python
21
star
23

actor-content-checker

You can use this actor to monitor any page's content and get a notification when the content changes.
JavaScript
18
star
24

actor-web-automation-agent

This is the experimental version of Web Automation Agent. The agent uses natural language instructions to browse the web and extract data.
TypeScript
16
star
25

actor-quick-start

Contains a boilerplate Apify actor to help you quickly get started building your own actors.
Dockerfile
15
star
26

devtools-server

Runs a simple server that allows you to connect to Chrome DevTools running on dynamic hosts, not only localhost.
JavaScript
13
star
27

apify-shared-js

Utilities and constants shared across Apify projects.
TypeScript
11
star
28

better-sqlite3-with-prebuilds

Better SQLite prebuild & publish action
10
star
29

chat-with-a-website

A simple app that lets you chat with a given website.
Python
9
star
30

actor-scrapy-executor

Apify actor to run web spiders written in Python with the Scrapy library
Python
9
star
31

apify-zapier-integration

Apify integration for Zapier
JavaScript
8
star
32

homebrew-tap

A Homebrew tap for Apify tools
Ruby
7
star
33

workflows

Apify's reusable GitHub workflows
6
star
34

actor-legacy-phantomjs-crawler

The actor implements the legacy Apify Crawler product. It uses the PhantomJS headless browser to recursively crawl websites and extract data from them using a piece of JavaScript code.
JavaScript
6
star
35

idcac

I Don't Care About Cookies extension compiled for use with Playwright/Puppeteer
JavaScript
6
star
36

act-crawler-results-to-s3

Apify actor to upload crawler results to AWS S3.
JavaScript
6
star
37

super-scraper

Generic REST API for scraping websites. Drop-in replacement for ScrapingBee, ScrapingAnt, and ScraperAPI services. And it is open-source!
TypeScript
5
star
38

actor-example-python

Example Apify Actor written in Python
Python
5
star
39

browser-headers-generator

Package generating randomized browser-like headers.
JavaScript
4
star
40

input-schema-editor-react

Apify input schema editor written in React.js
JavaScript
4
star
41

act-crawl-url-list

Apify actor to crawl a list of URLs
JavaScript
4
star
42

apify-storage-local-js

Local emulation of the apify-client NPM package, which enables local use of Apify SDK.
TypeScript
3
star
43

actor-imagediff

Returns an image containing the difference between two given images.
JavaScript
3
star
44

apify-web-covid-19

A list of public COVID-19 APIs to be rendered on https://apify.com/covid-19
JavaScript
3
star
45

http-request

An HTTP request library for Node.js with a common-sense API, support for Brotli compression, and without the bugs of the "request" NPM package
JavaScript
3
star
46

aidevworld2023

Slides and supporting materials for "How to get clean web data for chatbots and LLMs".
JavaScript
3
star
47

crawlee-parallel-scraping-example

An example repository showcasing how you can scrape in parallel using one request queue
TypeScript
3
star
48

actor-example-php

Example of an Apify actor using PHP
PHP
2
star
49

apify-php-tutorial

PHP
2
star
50

actor-example-proxy-intercept-request

Example: Intercept requests from an HTTPS connection using a "man-in-the-middle" proxy solution.
JavaScript
2
star
51

apify-eslint-config

Apify ESLint preset to be shared between projects
JavaScript
2
star
52

slack-messages-action

Wraps the sending of messages from Apify GitHub workflows to Slack.
TypeScript
2
star
53

scraping-tools-js

A library of utility functions that make scraping, data extraction and usage of headless browsers easier and faster.
JavaScript
2
star
54

actor-beautifulsoup-scraper

Python
2
star
55

apify-tsconfig

TypeScript configuration shared across projects in Apify.
Shell
1
star
56

generative-bayesian-network

JavaScript
1
star
57

waw-file-specification

Contains the specification of the Web Automation Workflow (WAW) file.
1
star
58

playwright-test-actor

Source code for the Playwright Test public actor.
TypeScript
1
star
59

apify-sdk-v2

Snapshot of Apify SDK v2 + sdk.apify.com website. This project is no longer maintained. See the https://github.com/apify/apify-sdk-js repo instead!
JavaScript
1
star
60

actor-algolia-website-indexer

Apify actor that crawls a website and indexes selected web pages to an Algolia index. It's used to power the search on https://help.apify.com
JavaScript
1
star
61

apify-eslint-config-ts

TypeScript ESLint configuration shared across projects in Apify.
JavaScript
1
star
62

actor-proxy-test

JavaScript
1
star
63

appmixer-components

Home of all the future Appmixer components on the Apify platform.
JavaScript
1
star
64

actor-example-secret-input

Example actor showcasing the secret input fields
Dockerfile
1
star
65

actor-scrapy-books-example

Example of a Python Scrapy project. It scrapes book data from https://books.toscrape.com/.
Python
1
star
66

komparz

Special, yet insignificant actors
JavaScript
1
star
67

actor-crawler-cheerio

DEPRECATED: An actor that crawls websites and parses HTML pages using the Cheerio library. Supports recursive crawling as well as URL lists.
JavaScript
1
star
68

actor-crawler-puppeteer

DEPRECATED: An Apify actor that enables crawling of websites using headless Chrome and Puppeteer. The actor is highly customizable and supports recursive crawling of websites as well as lists of URLs.
JavaScript
1
star
69

scrapy-migrator

A standalone POC script for wrapping Scrapy projects with Apify middleware.
Python
1
star