  • Stars: 187
  • Rank: 205,856 (Top 5 %)
  • Language: PHP
  • License: MIT License
  • Created: over 5 years ago
  • Updated: over 1 year ago

PHP Link Checker

Fink

Fink (pronounced "Phpink") is a command line tool, written in PHP, for checking HTTP links.

  • Check websites for broken links or error pages.
  • Asynchronous HTTP requests.

Installation

Install as a stand-alone tool or as a project dependency:

Installing as a project dependency

$ composer require dantleech/fink --dev

Installing from a PHAR

Download the PHAR from the Releases page.

Building your own PHAR with Box

You can build your own PHAR by cloning this repository and running:

$ ./vendor/bin/box compile

Usage

Run the command with a single URL to start crawling:

$ ./vendor/bin/fink https://www.example.com

Use --output=somefile to log verbose information for each URL in JSON format, including:

  • url: The tested URL.
  • status: The HTTP status code.
  • referrer: The page which linked to the URL.
  • referrer_title: The value (e.g. link title) of the referring element.
  • referrer_xpath: The path to the node in the referring document.
  • distance: The number of links away from the start document.
  • request_time: Number of microseconds taken to make the request.
  • timestamp: The time that the request was made.
  • exception: Any runtime exception encountered (e.g. malformed URL, etc).
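The fields above can be consumed from a script. The following is a minimal sketch, assuming the report contains one JSON object per line (which the jq example further down suggests, but which is not stated explicitly here). The sample records and the helper names read_report and summarise are hypothetical.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def read_report(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one record per non-empty line (assumes one JSON object per line)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def summarise(records: Iterable[dict]) -> dict:
    """Count records per HTTP status and find the slowest request."""
    status_counts: Counter = Counter()
    slowest = None
    for record in records:
        status_counts[record.get("status")] += 1
        # request_time may be missing or null, so treat that as 0.
        elapsed = record.get("request_time") or 0
        if slowest is None or elapsed > (slowest.get("request_time") or 0):
            slowest = record
    return {
        "status_counts": dict(status_counts),
        "slowest_url": slowest["url"] if slowest else None,
    }


# Hypothetical sample lines, shaped like the fields listed above.
SAMPLE = [
    '{"url": "https://www.example.com/", "status": 200, "distance": 0, "request_time": 1200}',
    '{"url": "https://www.example.com/missing", "status": 404, "distance": 1, "request_time": 3400}',
]

summary = summarise(read_report(SAMPLE))
print(summary)
```

To run it against a real report, pass an open file handle to read_report instead of SAMPLE.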

Arguments

  • url (multiple) Specify one or more base URLs to crawl (mandatory).

Options

  • --client-max-body-size: Max body size for HTTP client (in bytes).
  • --client-max-header-size: Max header size for HTTP client (in bytes).
  • --client-redirects=5: Set the maximum number of times the client should redirect (0 to never redirect).
  • --client-security-level=1: Set the default SSL security level.
  • --client-timeout=15000: Set the maximum amount of time (in milliseconds) the client should wait for a response, defaults to 15,000 (15 seconds).
  • --concurrency: Number of simultaneous HTTP requests to use.
  • --display-bufsize=10: Set the number of URLs to consider when showing the display.
  • --display=+memory: Set, add or remove elements of the runtime display (prefix with - or + to modify the default set).
  • --exclude-url=logout: (multiple) Exclude URLs matching the given PCRE pattern.
  • --header="Foo: Bar": (multiple) Specify custom header(s).
  • --help: Display available options.
  • --include-link=foobar.html: Include given link as if it were linked from the base URL.
  • --insecure: Do not verify SSL certificates.
  • --load-cookies: Load cookies from a cookies.txt file.
  • --max-distance: Maximum allowed distance from base URL (if not specified then there is no limitation).
  • --max-external-distance: Limit the external (disjoint) distance from the base URL.
  • --no-dedupe: Do not filter duplicate URLs (can result in a non-terminating process).
  • --output=out.json: Output JSON report for each URL to given file (truncates existing content).
  • --publisher=csv: Set the publisher, either json or csv (defaults to json).
  • --rate: Set a maximum number of requests to make in a second.
  • --stdout: Stream to STDOUT directly, disables display and any specified outfile.
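To make the idea of "distance" behind --max-distance more concrete, here is a conceptual sketch of a breadth-first walk over a link graph. It is an illustration only, not Fink's implementation: the link_distances function and the in-memory SITE graph are hypothetical, and the sketch assumes the limit is inclusive, which this README does not state.

```python
from collections import deque
from typing import Dict, List, Optional


def link_distances(
    links: Dict[str, List[str]],
    start: str,
    max_distance: Optional[int] = None,
) -> Dict[str, int]:
    """Return the number of links separating each reachable URL from start."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        # Do not follow links from pages that already sit at the limit.
        if max_distance is not None and distances[page] >= max_distance:
            continue
        for target in links.get(page, []):
            # Visit each URL once; without this check a cycle never terminates.
            if target not in distances:
                distances[target] = distances[page] + 1
                queue.append(target)
    return distances


# Hypothetical site: each key links to the URLs in its list.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/c"],
    "/b": [],
    "/c": ["/"],
}

print(link_distances(SITE, "/"))
print(link_distances(SITE, "/", max_distance=1))
```

The duplicate check in the loop also hints at why --no-dedupe can produce a non-terminating process: the cycle from /c back to / would be followed indefinitely.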

Examples

Crawl a single website

$ fink http://www.example.com --max-external-distance=0

Crawl a single website and check the status of external links

$ fink http://www.example.com --max-external-distance=1

Use jq to analyse results

jq is a tool which can be used to query and manipulate JSON data.

$ fink http://www.example.com -x0 -oreport.json
$ cat report.json | jq -c '. | select(.status==404) | {url: .url, referrer: .referrer}' | jq
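If jq is not available, the same filter can be approximated in a few lines of Python. This is a sketch under the assumption that report.json holds one JSON object per line; the broken_links function and the sample lines are hypothetical.

```python
import json
from typing import Iterable, Iterator


def broken_links(lines: Iterable[str], status: int = 404) -> Iterator[dict]:
    """Yield url/referrer pairs for every record with the given status."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("status") == status:
            yield {"url": record.get("url"), "referrer": record.get("referrer")}


# Hypothetical report lines; in practice, iterate over open("report.json").
REPORT_LINES = [
    '{"url": "http://www.example.com/", "status": 200, "referrer": null}',
    '{"url": "http://www.example.com/old", "status": 404, "referrer": "http://www.example.com/"}',
]

for link in broken_links(REPORT_LINES):
    print(json.dumps(link))
```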

Crawl pages behind a login

# create a cookies file for later re-use (simulate a login in this case via HTTP-POST)
$ curl -L --cookie-jar mycookies.txt -d username=myLogin -d password=MyP4ssw0rd https://www.example.org/my/login/url

# re-use the cookies file with your fink crawl command
$ fink https://www.example.org/myaccount --load-cookies=mycookies.txt

Note: it is not possible to create the cookie jar on computer A, store it, and read it back in on another machine, e.g. a Linux server. You need to create the cookie file from the very same IP address, because otherwise server-side session handling might not continue the HTTP session because of an IP mismatch.

Exit Codes

  • 0: All URLs were successful.
  • 1: Unexpected runtime error.
  • 2: At least one URL failed to resolve successfully.
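The exit codes make it straightforward to wrap Fink in a script, for example in a CI job. The sketch below maps the three documented codes to their meanings. The run_fink and describe_exit_code helpers are hypothetical, and the default binary path assumes the project-dependency install shown earlier.

```python
import subprocess
from typing import Sequence

# Meanings copied from the exit code list above.
EXIT_MEANINGS = {
    0: "All URLs were successful.",
    1: "Unexpected runtime error.",
    2: "At least one URL failed to resolve successfully.",
}


def describe_exit_code(code: int) -> str:
    """Translate an exit code into the documented meaning."""
    return EXIT_MEANINGS.get(code, f"Unknown exit code {code}.")


def run_fink(args: Sequence[str], binary: str = "./vendor/bin/fink") -> int:
    """Run Fink with the given arguments and return its exit code.

    The binary path is an assumption; adjust it for a PHAR install.
    """
    completed = subprocess.run([binary, *args], check=False)
    return completed.returncode


print(describe_exit_code(2))
```

A wrapper could call run_fink(["https://www.example.com", "--max-external-distance=0"]) and fail the build only when the returned code is non-zero.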
