  • Language: Java
  • License: Apache License 2.0
  • Created almost 10 years ago
  • Updated 9 months ago

StreamFlow™

Join the chat at https://gitter.im/lmco/streamflow

Overview

StreamFlow™ is a stream processing tool designed to help build and monitor processing workflows. The ultimate goal of StreamFlow is to make working with stream processing frameworks such as Apache Storm easier and faster, and to add enterprise-style management functionality.
StreamFlow also provides a mechanism for non-developers such as data scientists, analysts, or operational users to rapidly build scalable data flows and analytics.

Sample topology

StreamFlow provides the following capabilities:

  1. A responsive web interface for building and monitoring Storm topologies.
  2. An interactive drag and drop topology builder for authoring new topologies.
  3. A dashboard for monitoring the status and performance of topologies as well as viewing aggregated topology logs.
  4. A specialized topology engine which addresses some Storm complexities, such as ClassLoader isolation and serialization, and provides a mechanism for dependency injection.
  5. A modular framework for publishing and organizing new capabilities in the form of Spouts and Bolts.

How it works

The following is a simple depiction of the StreamFlow stack. The web interface is built using open source web frameworks and is backed by a series of reusable web services. StreamFlow authors and manages topologies dynamically using a series of reusable Frameworks. These Frameworks are simply JAR files containing standard Storm Spouts and Bolts, together with a metadata configuration file that exposes the framework. StreamFlow uses a custom topology driver to bootstrap and execute a topology along with StreamFlow-specific configuration logic.

Concepts

The following is a description of some core StreamFlow concepts and terminology.

Component

Components represent business logic modules which are draggable in the StreamFlow UI. Examples of Components include Storm Spouts and Storm Bolts.

Framework

A Framework is a grouping of related Components and their associated metadata. Ideally, the elements of a framework should all be compatible when wired together in a topology, since they share the same protocol. Frameworks might be organized around a set of technologies or domains. An analogy would be a Java library or an Objective-C framework. Topologies have frameworks as dependencies.
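
The relationship between a framework and its components can be sketched in plain Java. This is only an illustrative model under stated assumptions: the `FrameworkSketch`, `Framework`, `Component`, and `Kind` types and every name in `sample()` are hypothetical, and they do not reflect StreamFlow's actual classes or its metadata file format.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model for illustration only; these are not StreamFlow classes.
final class FrameworkSketch {

    enum Kind { SPOUT, BOLT }

    // A component: a named Spout or Bolt backed by an implementation class.
    static final class Component {
        final String name;
        final Kind kind;
        final String implementationClass;

        Component(String name, Kind kind, String implementationClass) {
            this.name = name;
            this.kind = kind;
            this.implementationClass = implementationClass;
        }
    }

    // A framework: a named, versioned group of related components.
    static final class Framework {
        final String name;
        final String version;
        private final List<Component> components = new ArrayList<>();

        Framework(String name, String version) {
            this.name = name;
            this.version = version;
        }

        Framework add(Component component) {
            components.add(component);
            return this;
        }

        List<Component> components() {
            return Collections.unmodifiableList(components);
        }

        // Names of all components of the given kind, in registration order.
        List<String> namesOf(Kind kind) {
            List<String> names = new ArrayList<>();
            for (Component c : components) {
                if (c.kind == kind) {
                    names.add(c.name);
                }
            }
            return names;
        }
    }

    // Builds an example framework; every name below is made up.
    static Framework sample() {
        return new Framework("example-framework", "1.0.0")
                .add(new Component("sentence-spout", Kind.SPOUT, "com.example.SentenceSpout"))
                .add(new Component("split-bolt", Kind.BOLT, "com.example.SplitBolt"))
                .add(new Component("count-bolt", Kind.BOLT, "com.example.CountBolt"));
    }

    public static void main(String[] args) {
        Framework framework = sample();
        System.out.println(framework.name + " " + framework.version);
        System.out.println("spouts: " + framework.namesOf(Kind.SPOUT));
        System.out.println("bolts:  " + framework.namesOf(Kind.BOLT));
    }
}
```

In this model a topology would declare a dependency on a framework by name and version, then pick components from it.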

Resource

A resource is an object used by spouts and bolts to externalize common state. For example, a resource might represent a technical asset in the environment or cluster, such as a database or a Kafka queue. Alternatively, a resource might provide an uploaded file or a container of global state. Use resources to encapsulate functionality outside a bolt or spout when that information is used in several places in a topology or across multiple topologies. Resources also provide a useful mechanism for injecting parameters, connections, or state into a bolt or spout, making it simpler, easier to write, and more testable.
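
The injection idea can be illustrated with a small plain-Java sketch. It assumes nothing about StreamFlow's real injection mechanism: `QueueResource` and `PublishingBolt` are hypothetical stand-ins, the host, port, and topic values are made up, and a real bolt would implement Storm's bolt interfaces rather than this simplified `process` method.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; none of these types come from StreamFlow or Storm.
final class ResourceSketch {

    // A resource: shared connection settings, defined once.
    static final class QueueResource {
        final String host;
        final int port;
        final String topic;

        QueueResource(String host, int port, String topic) {
            this.host = host;
            this.port = port;
            this.topic = topic;
        }

        String address() {
            return host + ":" + port + "/" + topic;
        }
    }

    // Stand-in for a bolt. It receives the resource instead of hard-coding settings.
    static final class PublishingBolt {
        private final QueueResource queue;
        private final List<String> published = new ArrayList<>();

        PublishingBolt(QueueResource queue) {
            this.queue = queue;
        }

        // Records where the tuple would be sent; a real bolt would publish it.
        void process(String tuple) {
            published.add(queue.address() + " <- " + tuple);
        }

        List<String> published() {
            return Collections.unmodifiableList(published);
        }
    }

    public static void main(String[] args) {
        // One resource instance is shared by two bolts.
        QueueResource queue = new QueueResource("localhost", 9092, "events");
        PublishingBolt first = new PublishingBolt(queue);
        PublishingBolt second = new PublishingBolt(queue);
        first.process("hello");
        second.process("world");
        System.out.println(first.published());
        System.out.println(second.published());
    }
}
```

Because the bolt only depends on the object passed to its constructor, a test can hand it a resource pointing at a local or fake endpoint without changing the bolt itself.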

Serialization

Serializations allow for the definition of custom serializers/deserializers. Specifically, these serializations should be specified in the Kryo format to integrate properly with Storm.

Topology

Topologies in Storm define the processing logic and the links between nodes that describe the data flow. StreamFlow uses registered components to let users build topologies dynamically in a drag and drop interface. This allows topologies to be built from existing components without requiring additional code.
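
As a rough illustration of nodes and links describing a data flow, a topology can be modeled as a directed graph. The sketch below is plain Java and uses neither the Storm nor the StreamFlow API; `TopologySketch`, its methods, and the node names are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a topology as a directed graph of named components.
final class TopologySketch {

    // Node name -> names of the nodes that receive its output.
    private final Map<String, List<String>> links = new LinkedHashMap<>();

    TopologySketch addNode(String name) {
        if (links.containsKey(name)) {
            throw new IllegalArgumentException("duplicate node: " + name);
        }
        links.put(name, new ArrayList<String>());
        return this;
    }

    TopologySketch connect(String from, String to) {
        if (!links.containsKey(from) || !links.containsKey(to)) {
            throw new IllegalArgumentException("unknown node in link " + from + " -> " + to);
        }
        links.get(from).add(to);
        return this;
    }

    // Nodes reachable from start, breadth-first, each listed once.
    List<String> flowFrom(String start) {
        if (!links.containsKey(start)) {
            throw new IllegalArgumentException("unknown node: " + start);
        }
        List<String> order = new ArrayList<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(start);
        while (!pending.isEmpty()) {
            String node = pending.remove();
            if (order.contains(node)) {
                continue;
            }
            order.add(node);
            pending.addAll(links.get(node));
        }
        return order;
    }

    // Builds an example topology; the component names are made up.
    static TopologySketch sample() {
        return new TopologySketch()
                .addNode("sentence-spout")
                .addNode("split-bolt")
                .addNode("count-bolt")
                .addNode("log-bolt")
                .connect("sentence-spout", "split-bolt")
                .connect("split-bolt", "count-bolt")
                .connect("split-bolt", "log-bolt");
    }

    public static void main(String[] args) {
        System.out.println(sample().flowFrom("sentence-spout"));
    }
}
```

A drag and drop builder can be thought of as a visual editor for this kind of structure: dropping a component adds a node, and drawing a wire adds a link.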

Find out more

The StreamFlow Wiki is the best place to go to learn more about the StreamFlow architecture and how to install and configure a StreamFlow server in your environment.

https://github.com/lmco/streamflow/wiki

Here are some quick links to help get you started with StreamFlow:

Questions or need help?

If you have any questions or issues please feel free to contact the development team using one of the following methods.

License

StreamFlow is copyright 2014 Lockheed Martin Corporation.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this software except in compliance with the License.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

This product incorporates open source software components covered by the terms of third party license agreements contained in the /Licenses folder of this project.

Documentation Version

Last Updated: 1/7/2015
