• Stars
    star
    40
  • Rank 680,660 (Top 14 %)
  • Language
    Python
  • License
    Creative Commons ...
  • Created over 7 years ago
  • Updated almost 2 years ago

Repository Details

A listing of World Wide Web archives, for humans and machines, using the Web Archive Manifest (WAM) YAML format

More Repositories

1

pywb

Core Python Web Archiving Toolkit for replay and recording of web archives
JavaScript
1,366
star
2

archiveweb.page

A High-Fidelity Web Archiving Extension for Chrome and Chromium based browsers!
JavaScript
841
star
3

replayweb.page

Serverless replay of web archives directly in the browser
TypeScript
693
star
4

browsertrix-crawler

Run a high-fidelity browser-based crawler in a single Docker container
TypeScript
547
star
5

webrecorder-player

Webrecorder Player for Desktop (OSX/Windows/Linux). (Built with Electron + Webrecorder)
JavaScript
423
star
6

warcio

Streaming WARC/ARC library for fast web archive IO
Python
345
star
7

webrecorder-desktop

Webrecorder Desktop App!
JavaScript
201
star
8

browsertrix

Browsertrix is the hosted, high-fidelity, browser-based crawling service from Webrecorder designed to make web archiving easier and more accessible for all!
TypeScript
178
star
9

specs

Specifications developed and maintained by the Webrecorder community.
HTML
117
star
10

wabac.js

wabac.js - Web Archive Browsing Augmentation Client
JavaScript
98
star
11

browsertrix-old

Browsertrix: Containerized High-Fidelity Browser-Based Automated Crawling + Behavior System
Python
87
star
12

wombat

Wombat.js client-side rewriting library
JavaScript
81
star
13

warcit

Convert Directories, Files and ZIP Files to Web Archives (WARC)
Python
79
star
14

har2warc

Convert HTTP Archive (HAR) -> Web Archive (WARC) format
Python
42
star
15

warcio.js

JS Streaming WARC IO optimized for Browser and Node
TypeScript
34
star
16

py-wacz

Python
32
star
17

browsertrix-behaviors

Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler.
TypeScript
30
star
18

archiveweb.page-site

The ArchiveWeb.page Site
HTML
27
star
19

wsgiprox

Python WSGI Middleware for adding HTTP/S proxy support to any WSGI Application
Python
22
star
20

cdxj-indexer

CDXJ Indexing of WARC/ARCs
Python
21
star
21

web-replay-gen

Static Site Generator for Viewing Web Archives (in WACZ format)
JavaScript
19
star
22

oembed.link

A Cloudflare Worker to render embeds on a single page using oEmbed
JavaScript
14
star
23

dat-share

A prototype server to swarm multiple DATs for Webrecorder
JavaScript
12
star
24

pywb-remote-browsers

Docker Compose based system for running remote browsers (including Flash and Java support) connected to web archives
Python
12
star
25

markdown-to-respec

A GitHub Action for turning Markdown into ReSpec HTML
Python
12
star
26

behaviors

Webrecorder Automated In-Page Behavior Framework
JavaScript
11
star
27

dat-s3-hybrid-storage

An S3 hybrid storage interface for dat and hyperdrive
JavaScript
11
star
28

express.archiveweb.page

ArchiveWeb.page Express!
JavaScript
8
star
29

wacz-auth-spec

Specification for authentication and creating signed WACZ Files
8
star
30

platform-spec

Discussion of the broader Webrecorder platform spec
7
star
31

autobrowser

Python
7
star
32

authsign

Python
7
star
33

kubecaptures-backend

JavaScript
6
star
34

awp-sw

JavaScript
6
star
35

sup-digital-web-archives

A collection of self-hostable web archives built for Stanford University Press (SUP)
HTML
6
star
36

example-webarchive

This is an example web archive using the ReplayWebPage component.
HTML
6
star
37

save-tweet-now

Save Tweet Now (to IPFS)
JavaScript
6
star
38

browserkube

Webrecorder Kubernetes-native Browser Orchestration
JavaScript
5
star
39

webrecorder-tests

QA tests for webrecorder player (WORK IN PROGRESS)
Python
5
star
40

autoscalar

Webrecorder Auto Archiver for Scalar Prototype
Python
5
star
41

browsertrix-browser-base

Dockerfile
5
star
42

wacz-uploader

A straightforward single page application for uploading your WACZ archives to IPFS
JavaScript
5
star
43

wabac-cors-proxy

CORS proxy for use with wabac.js-based tools
JavaScript
4
star
44

ipfs-composite-files

CLI and library for creating composite files in IPFS
JavaScript
3
star
45

wabac.js-1.0

JavaScript
3
star
46

wacz2car

Convert WACZ files to CAR files for uploading to IPFS.
JavaScript
3
star
47

proxy

(Deprecated) Old Webrecorder proxy component based on mitmproxy
Python
2
star
48

community

Webrecorder Community Info
2
star
49

kubecaptures-ui

JavaScript
2
star
50

mapbox-driver

Browsertrix Crawler driver for Mapbox maps
JavaScript
2
star
51

dashboard-custom-drivers

Custom drivers for browsertrix crawler
JavaScript
1
star
52

sucho-web-archive

JavaScript
1
star
53

functional-spec

A description of the interface and functionalities of webrecorder.io
1
star