  • Stars: 105
  • Rank 326,342 (Top 7 %)
  • Language: Python
  • License: Apache License 2.0
  • Created almost 9 years ago
  • Updated over 2 years ago

Repository Details

Processing framework for containerized algorithms

Scale

Join the chat at https://gitter.im/ngageoint/scale

Scale is a system that provides management of automated processing on a cluster of machines. It allows users to define jobs, which can be any type of script or algorithm. These jobs run on ingested source data and produce product files. The produced products can be disseminated to appropriate users and/or used to evaluate the producing algorithm in terms of performance and accuracy.

Mesos and Nodes

Scale runs across a cluster of networked machines (called nodes) that process the jobs. Scale utilizes Apache Mesos, a free and open source project, for managing the available resources on the nodes. Mesos informs Scale of available computing resources and Scale schedules jobs to run on those resources.
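The offer-and-schedule loop described above can be illustrated with a minimal sketch. This is not Scale's scheduler and does not use the Mesos API; the `Offer` and `Job` classes, the node names, and the greedy first-fit strategy are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Hypothetical stand-in for the resources one node reports as available."""
    node: str
    cpus: float
    mem_mb: float


@dataclass
class Job:
    """Hypothetical job with the resources it needs to run."""
    name: str
    cpus: float
    mem_mb: float


def schedule(offers, jobs):
    """Greedy first-fit: place each job on the first node with enough free resources.

    Returns a dict mapping job name to node name. Jobs that fit nowhere are
    left out of the result and would wait for a later round of offers.
    """
    remaining = {o.node: [o.cpus, o.mem_mb] for o in offers}
    placements = {}
    for job in jobs:
        for node, free in remaining.items():
            if free[0] >= job.cpus and free[1] >= job.mem_mb:
                free[0] -= job.cpus      # reserve the CPUs on that node
                free[1] -= job.mem_mb    # reserve the memory on that node
                placements[job.name] = node
                break
    return placements


offers = [Offer("node-a", 2.0, 1024), Offer("node-b", 4.0, 4096)]
jobs = [Job("small", 1.0, 512), Job("big", 3.0, 2048), Job("huge", 8.0, 512)]
placements = schedule(offers, jobs)
```

In this run "small" lands on node-a, "big" on node-b, and "huge" stays unplaced because no node offers 8 CPUs.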

Ingest

Scale ingests source files using a Scale component called Strike. Strike is a process that monitors an ingest directory into which source data files are being copied. After a new source data file has been ingested, Scale produces and places jobs on the queue depending on the type of the ingested file. Many Strike processes can be run simultaneously, allowing Scale to monitor many different ingest directories.
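As a rough illustration of the monitoring idea, the sketch below polls a directory for files it has not seen before and looks up which job types to queue by file extension. It is not how Strike is implemented; the function names, the `RULES` table, and the job type names are hypothetical.

```python
import tempfile
from pathlib import Path

# Hypothetical rule table: file extension -> job type names to queue.
RULES = {".tif": ["parse-geotiff"], ".h5": ["parse-hdf5"]}


def poll_ingest_dir(ingest_dir, seen):
    """Return files that appeared since the last poll and remember their names."""
    new_files = []
    for path in sorted(Path(ingest_dir).iterdir()):
        if path.is_file() and path.name not in seen:
            seen.add(path.name)
            new_files.append(path)
    return new_files


def job_types_for(path, rules=RULES):
    """Pick the job types to queue for a file, based on its extension."""
    return rules.get(path.suffix.lower(), [])


# Small demonstration against a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    seen = set()
    (Path(tmp) / "scene.tif").write_text("fake data")
    first = [p.name for p in poll_ingest_dir(tmp, seen)]
    second = [p.name for p in poll_ingest_dir(tmp, seen)]   # nothing new
    (Path(tmp) / "notes.txt").write_text("no rule for this type")
    third = [p.name for p in poll_ingest_dir(tmp, seen)]

queued = job_types_for(Path("scene.tif"))
```

A real monitor would also need to confirm that a file has finished copying before treating it as ingested; the sketch ignores that concern.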

Jobs

Scale creates jobs based on its known job types. A job type defines key characteristics about an algorithm that Scale needs to know in order to run it (what command to run, the algorithm's inputs and outputs, etc.). Job types are labeled with versions, allowing Scale to run multiple versions of the same algorithm. Jobs may be created automatically due to an event, such as the ingest of a particular type of source data file, or they may be created manually by a user. Jobs that need to be executed are placed onto and prioritized within a queue before being scheduled onto an available node. When multiple jobs need to be run in a serial or parallel sequence, a recipe can be created that defines the job workflow.
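The two ideas in this section, a prioritized queue and a recipe that orders dependent jobs, can be sketched with the Python standard library. This is only an illustration, not Scale's data model: the `JobQueue` class, the job names, and the convention that a lower number means higher priority are assumptions. `graphlib` requires Python 3.9 or later.

```python
import heapq
import itertools
from graphlib import TopologicalSorter


class JobQueue:
    """Hypothetical priority queue; lower numbers run first (an assumption)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order

    def put(self, job_name, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), job_name))

    def get(self):
        return heapq.heappop(self._heap)[2]


queue = JobQueue()
queue.put("low", 200)
queue.put("urgent", 10)
queue.put("also-urgent", 10)
drained = [queue.get() for _ in range(3)]

# A recipe expressed as a dependency graph: each job maps to the jobs it
# must wait for. "detect" and "thumbnail" can run in parallel after "parse".
recipe = {
    "parse": set(),
    "detect": {"parse"},
    "thumbnail": {"parse"},
    "publish": {"detect", "thumbnail"},
}
order = list(TopologicalSorter(recipe).static_order())
```

Here `drained` comes out as urgent, also-urgent, low, and `order` always starts with "parse" and ends with "publish"; the relative order of the two middle jobs is not significant because they do not depend on each other.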

Products

Jobs can produce products as a result of their successful execution. Products may be disseminated to users or used to analyze and improve the algorithms that produced them. Scale allows the creation of different workspaces. A workspace defines a separate location for storing source or product files. When a job is created, it is given a workspace to use for storing its results, allowing a user to control whether the job's results are available to a wider audience or are restricted to a private workspace for the user's own use.

Scale Dependencies

Scale requires several external components to run as intended. PostgreSQL stores all internal system state and must be accessible to both the scheduler and web server processes. Fluentd and Elasticsearch collect and store all algorithm logs. A message broker is required for in-flight storage of internal Scale messages and must be accessible to all system components. The following versions of these services are required to support Scale:

  • Elasticsearch 6.6.2
  • Fluentd 1.4
  • PostgreSQL 9.4+
  • PostGIS 2.0+
  • Message Broker (RabbitMQ 3.6+ or Amazon SQS)

Note: We strongly recommend using managed services for PostgreSQL (AWS RDS), Messaging (AWS SQS) and Elasticsearch (AWS Elasticsearch Service), if available to you. Use of these services in Docker containers should be avoided in all but development environments. Reference the Architecture documentation for additional details on configuring supporting services.
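One way to check an environment against the version list above is a small comparison helper. The sketch below is an assumption-laden illustration and not part of Scale: the `REQUIREMENTS` table and function names are hypothetical, entries written with a "+" are treated as minimums, and entries without one (Elasticsearch 6.6.2, Fluentd 1.4) are treated as pinned prefixes, which is an interpretation rather than something the list states.

```python
def parse_version(text):
    """Turn a dotted version string such as '9.6.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical encoding of the list above. "minimum" mirrors the "+" entries;
# "pinned" is an assumed reading of the entries listed without a "+".
REQUIREMENTS = {
    "elasticsearch": ("pinned", (6, 6, 2)),
    "fluentd": ("pinned", (1, 4)),
    "postgresql": ("minimum", (9, 4)),
    "postgis": ("minimum", (2, 0)),
    "rabbitmq": ("minimum", (3, 6)),
}


def satisfies(service, installed):
    """Return True if the installed version meets the requirement for a service."""
    kind, required = REQUIREMENTS[service]
    version = parse_version(installed)
    if kind == "minimum":
        return version >= required
    # Pinned: the installed version must start with the required components.
    return version[:len(required)] == required


checks = {
    "postgresql 9.6.3": satisfies("postgresql", "9.6.3"),
    "postgresql 9.3": satisfies("postgresql", "9.3"),
    "fluentd 1.4.2": satisfies("fluentd", "1.4.2"),
    "elasticsearch 7.0.0": satisfies("elasticsearch", "7.0.0"),
}
```

The helper only handles purely numeric dotted versions; suffixes such as "-rc1" would need extra parsing.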

Quick Start

While Scale can be entirely run on a pure Apache Mesos cluster, we strongly recommend using Data Center Operating System (DC/OS). DC/OS provides service discovery, load-balancing and fail-over for Scale, as well as deployment scripts for nearly all imaginable target infrastructures. This stack allows Scale users to focus on use of the framework while minimizing effort spent on deployment and configuration. A complete quick start guide can be found at:

https://ngageoint.github.io/scale/quickstart.html

Algorithm Development

Scale is designed to allow development of recipes and jobs for your domain without having to concern yourself with the complexities of cluster scheduling or data flow management. As long as your processing can be accomplished with discrete inputs on a Linux command line, it can be run in Scale. Simple examples of a complete processing chain can be found within the above quick start or you can refer to our in-depth documentation for step-by-step Scale integration:

https://ngageoint.github.io/scale/docs/algorithm_integration/index.html

Scale Development

If you want to contribute to the actual Scale open source project, we welcome your contributions. There are 2 primary components of Scale:

The links provide specific development environment setup instructions for each individual component.

Build

Scale is tested and built using a combination of Travis CI and Docker Hub. All unit test execution and documentation generation are done using Travis CI. We require that any pull request fully pass unit test checks prior to being merged. Between releases, Docker Hub builds are saved to x.x.x-snapshot image tags; on release, the image tags match the release version.

A new release can be cut using the generate-release.sh shell script from a cloned Scale repository (where numbers refer to MAJOR MINOR PATCH versions respectively):

./generate-release.sh 4 0 0 

There is no direct connection between the Travis CI and Docker Hub builds, but both are launched via push to the GitHub repository.

Contributing

Scale was developed at the National Geospatial-Intelligence Agency (NGA). The government has "unlimited rights" and is releasing this software to increase the impact of government investments by providing developers with the opportunity to take things in new directions. The software use, modification, and distribution rights are stipulated within the Apache 2.0 license.

All pull request contributions to this project will be released under the Apache 2.0 or compatible license. Software source code previously released under an open source license and then modified by NGA staff is considered a "joint work" (see 17 USC § 101); it is partially copyrighted, partially public domain, and as a whole is protected by the copyrights of the non-government authors and must be released according to the terms of the original open source license.
