  • Language
    Python
  • License
    MIT License
  • Created almost 11 years ago
  • Updated 4 months ago

Repository Details

Scrapes an ESRI MapServer REST endpoint to spit out more generally-usable geodata.

esri-dump

Scrapes an Esri REST endpoint and writes a GeoJSON file.

Installation

If you just want to use the command line tool esri2geojson, the recommended way to install this package is into a virtual environment. This method does not require cloning this repository and gets you up and running quickly:

virtualenv esridump
source esridump/bin/activate
pip install esridump

Usage

Command line

This module will install a command line utility called esri2geojson that accepts an Esri REST layer endpoint URL and a filename to write the output GeoJSON to:

esri2geojson http://cookviewer1.cookcountyil.gov/ArcGIS/rest/services/cookVwrDynmc/MapServer/11 cookcounty.geojson

You can write to stdout by using the special output filename of - (a single dash character).

You can also pass in the --jsonlines option to write newline-separated (\n) lines of GeoJSON features, which you can then pipe into other applications.
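
As a rough sketch of what a downstream consumer of that output might look like, the following reads newline-separated GeoJSON features from any text stream. The function name iter_features is made up for this illustration and is not part of esridump, and the two sample features are placeholder data rather than real output:

```python
import io
import json

def iter_features(stream):
    """Yield one GeoJSON Feature dict per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)

# Stand-in for piped input; these features are sample data, not real output.
sample = io.StringIO(
    '{"type": "Feature", "properties": {"id": 1}, "geometry": null}\n'
    '{"type": "Feature", "properties": {"id": 2}, "geometry": null}\n'
)
features = list(iter_features(sample))
print(len(features))
```

In a real pipeline you would pass sys.stdin to iter_features instead of the StringIO sample, and feed it by running esri2geojson with --jsonlines and the - output filename.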

Python module

You can also use this module in your own code to get features as GeoJSON Feature-shaped Python dicts:

import json
from esridump.dumper import EsriDumper

d = EsriDumper('http://example.com/arcgis/rest/services/Layer/MapServer/1')

# Iterate over each feature
for feature in d:
    print(json.dumps(feature))

d = EsriDumper('http://example.com/arcgis/rest/services/Layer/MapServer/2')

# Or get all features in one list
all_features = list(d)
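
If you need a single GeoJSON document rather than individual features, one option is to wrap the features in a FeatureCollection yourself. This is a minimal sketch: to_feature_collection is a hypothetical helper, not something esridump provides, and a plain list of sample features stands in for an EsriDumper instance so the snippet runs without a server:

```python
import json

def to_feature_collection(features):
    """Wrap an iterable of GeoJSON Feature dicts in a FeatureCollection dict."""
    return {"type": "FeatureCollection", "features": list(features)}

# In real use, `features` would be an EsriDumper instance; a list of sample
# features is used here so the example is self-contained.
sample = [{"type": "Feature", "properties": {"name": "a"}, "geometry": None}]
collection = to_feature_collection(sample)
print(json.dumps(collection))
```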

Methodology

The module will do its best to find the most efficient method of retrieving data from the Esri server, given the capabilities of the server. We use several strategies to get the data, listed here from most to least efficient:

resultOffset Pagination

In ArcGIS REST API version 10.3, Esri added support for pagination directly with the resultOffset and resultRecordCount parameters. Unfortunately, most servers don't support this feature because the backend SQL engine must also be configured to support it. So far, it seems that only the Esri-hosted layers support this feature reliably.
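
To illustrate the idea, here is a sketch of how the query parameters for each page could be generated. It is not the module's actual code: offset_pages is a hypothetical helper, the total feature count is assumed to be known already, and the where, outFields, and f values are just typical choices for a query that returns everything:

```python
def offset_pages(total_count, page_size):
    """Yield query parameters for each page of a resultOffset-paginated layer."""
    for offset in range(0, total_count, page_size):
        yield {
            "where": "1=1",          # match every feature
            "outFields": "*",        # return all attribute fields
            "resultOffset": offset,  # number of features to skip
            "resultRecordCount": page_size,
            "f": "json",
        }

pages = list(offset_pages(total_count=2500, page_size=1000))
print([p["resultOffset"] for p in pages])  # [0, 1000, 2000]
```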

objectId Field Chunking

In ArcGIS REST API version 10.0, Esri added support for the server to return an exhaustive list of object IDs for all features in a layer. Once we have this list, we break it into chunks of at most maxRecordCount IDs and request each chunk with the objectIds parameter.
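
The chunking step might look roughly like the sketch below. chunk_object_ids is a hypothetical name used for illustration, and the outFields and f values are assumed defaults rather than anything taken from the module:

```python
def chunk_object_ids(object_ids, max_record_count):
    """Yield query parameters covering object_ids in chunks of max_record_count."""
    ids = sorted(object_ids)
    for start in range(0, len(ids), max_record_count):
        chunk = ids[start:start + max_record_count]
        yield {
            "objectIds": ",".join(str(i) for i in chunk),
            "outFields": "*",
            "f": "json",
        }

# Seven sample IDs with an assumed server limit of three records per query.
queries = list(chunk_object_ids(range(1, 8), max_record_count=3))
print([q["objectIds"] for q in queries])  # ['1,2,3', '4,5,6', '7']
```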

objectId Statistics where-clauses

In ArcGIS REST API version 10.1, Esri added support for performing various statistical queries on the server without requiring the client to download the whole dataset. On servers that support this and don't respond to the objectIds queries, we will use a minimum and maximum statistics query to find the minimum and maximum values for the objectId column, then build chunks of where-clauses that narrow the range down to objectIds between two fenceposts.
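
As a sketch of the fencepost idea, the function below turns a minimum and maximum object ID into a series of where-clauses, each spanning at most chunk_size IDs. The function name and the OBJECTID field name are assumptions for the example; the real field name varies by layer:

```python
def oid_where_clauses(oid_min, oid_max, chunk_size, oid_field="OBJECTID"):
    """Yield where-clauses that together cover oid_min..oid_max inclusive."""
    for lower in range(oid_min, oid_max + 1, chunk_size):
        upper = min(lower + chunk_size - 1, oid_max)
        yield f"{oid_field} >= {lower} AND {oid_field} <= {upper}"

# Sample range of 1..25 with an assumed limit of 10 records per query.
clauses = list(oid_where_clauses(1, 25, chunk_size=10))
for clause in clauses:
    print(clause)
```

Because object IDs can have gaps, a range of chunk_size IDs may return fewer than chunk_size features, but never more.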

Geometry Quadtree Queries

When a server does not support any of these methods, we'll make recursive quad-tree queries using bounding envelopes. We start with a query for the layer's entire extent. If the server returns exactly the maxRecordCount number of features, we split that extent into 4 equal rectangles and query those. If any of those smaller queries also returns maxRecordCount features, we split that rectangle again, and continue until the server returns fewer than maxRecordCount features.
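
The recursion can be sketched as follows. This is an illustration under simplifying assumptions, not the module's implementation: split_envelope, quadtree_fetch, and fake_query are hypothetical names, the "server" is an in-memory list of points, and envelopes are (xmin, ymin, xmax, ymax) tuples:

```python
def split_envelope(envelope):
    """Split an (xmin, ymin, xmax, ymax) envelope into 4 equal rectangles."""
    xmin, ymin, xmax, ymax = envelope
    xmid = (xmin + xmax) / 2
    ymid = (ymin + ymax) / 2
    return [
        (xmin, ymin, xmid, ymid),
        (xmid, ymin, xmax, ymid),
        (xmin, ymid, xmid, ymax),
        (xmid, ymid, xmax, ymax),
    ]

def quadtree_fetch(envelope, query, max_record_count):
    """Collect features, splitting any envelope whose query hits the cap."""
    features = query(envelope)
    if len(features) < max_record_count:
        return features  # the server returned everything in this envelope
    collected = []
    for quadrant in split_envelope(envelope):
        collected.extend(quadtree_fetch(quadrant, query, max_record_count))
    return collected

# A fake server: 16 sample points on a 4x4 grid, capped at 5 per response.
MAX_RECORD_COUNT = 5
points = [(x + 0.5, y + 0.5) for x in range(4) for y in range(4)]

def fake_query(envelope):
    xmin, ymin, xmax, ymax = envelope
    inside = [p for p in points if xmin <= p[0] < xmax and ymin <= p[1] < ymax]
    return inside[:MAX_RECORD_COUNT]

result = quadtree_fetch((0, 0, 4, 4), fake_query, MAX_RECORD_COUNT)
print(len(result))
```

Against a real server, features that straddle a split line can come back from more than one quadrant, so a real implementation would likely need to de-duplicate results, for example by object ID.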

Development

To suggest changes or improvements to this code, create a fork on GitHub and clone your repository locally:

git clone git@github.com:openaddresses/pyesridump.git # replace with your fork
cd pyesridump

We use Pipenv to manage dependencies for development. Make sure you have Pipenv installed and then install the dependencies for development:

pipenv install --dev
pipenv shell

Your changes to the code will be reflected when you run the esri2geojson command from within the virtual environment. You can also run (and add) tests to check that your changes didn't break anything:

nosetests

See Also

This Python module was extracted from OpenAddresses machine, which was inspired by code from koop. A similar node/JavaScript module is available in esri-dump.
