  • Stars: 157
  • Rank: 237,076 (Top 5%)
  • Language: Python
  • License: GNU General Publi...
  • Created over 4 years ago
  • Updated 2 months ago

Repository Details

Code relating to scraping public police data.

Welcome!

This is the GitHub home for the Police Data Accessibility Project. We're assembling a toolkit and space for shared resources. People all over the country use these resources to collect public records about the U.S. criminal legal system.

This repository is also a guide to the countless ways we use scraper code to access data. (What do we mean by web scraper?)

Note: This repo is a work in progress, especially the structure of its utilities. Don't trust everything you read here!

How to run a scraper

Right now, this requires some Python knowledge and patience. We're in the early stages: there's no automated scraper farm or fancy GUI yet. Scrapers can be run locally as needed.

  1. Install Python. Prefer a differently opinionated guide? Perhaps this is more your speed.
  2. Clone this repo.
  3. Find the scraper you wish to run. These are sorted geographically, so start by looking in /USA/....
  4. Follow the instructions in the scraper's README to get going. (If it's broken or simply out of date, please open an issue in this repo or submit a PR.)
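
Step 3 can be tedious by hand once the tree gets deep. As a rough sketch (not part of the repo's own tooling), the snippet below lists every folder under the geographic tree of a local clone that has its own README. The function name `find_scrapers`, the `USA` default, and the `README.md` filename are assumptions made for illustration; adjust them to match what you actually see in your clone.

```python
from pathlib import Path


def find_scrapers(repo_root, region="USA", readme_name="README.md"):
    """Return sorted folders under <repo_root>/<region> that contain a README.

    Hypothetical helper: the region folder name and README filename are
    assumptions about the layout, not something the repo guarantees.
    """
    base = Path(repo_root) / region
    if not base.is_dir():
        # Wrong path or unexpected layout: report nothing rather than fail.
        return []
    # rglob searches every nesting level, so the depth of the
    # geographic folders does not matter.
    return sorted(
        readme.parent.relative_to(repo_root) for readme in base.rglob(readme_name)
    )


if __name__ == "__main__":
    # "." assumes the script is run from the root of your local clone.
    for scraper_dir in find_scrapers("."):
        print(scraper_dir)
```

Each printed path is a folder whose README is worth opening for step 4. A folder that holds only an overview README will show up too, so treat the output as a list of candidates rather than a definitive index of scrapers.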

Sharing back to the PDAP community

If you do something cool or interesting or fun with your shiny new data, share that in our Discord. Want to kick around an idea or share something that doesn't work as expected? Discord's a great place for that, too.

How to contribute

To write a scraper, start with CONTRIBUTING.md. Be sure to check out the /common folder!

For everything else, start with docs.pdap.io.

Resources

Here are some potentially useful tools. If you want to make additions or updates, you can edit the docs on GitHub!