  • Stars: 151
  • Rank: 237,225 (Top 5 %)
  • Language: Ruby
  • License: MIT License
  • Created over 10 years ago
  • Updated 10 months ago

Repository Details

Data Analysis Pipeline

DAP: The Data Analysis Pipeline

DAP was created to transform text-based data on the command-line, specializing in transforms that are annoying or difficult to do with existing tools.

DAP reads data using an input plugin, transforms it through a series of filters, and prints it out again using an output plugin. Every record is treated as a document (aka: hash/dict) and filters are used to reduce, expand, and transform these documents as they pass through. Think of DAP as a mashup between sed, awk, grep, csvtool, and jq, with map/reduce capabilities.
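
The input, filter, and output flow described above can be pictured with a small Ruby sketch. This is only an illustration of the idea, not DAP's implementation: the names `read_lines`, `run_pipeline`, `write_json`, and the three example filters are hypothetical, and each filter is assumed to take one document and return zero or more documents.

```ruby
# Illustrative only: a toy pipeline in the spirit of "input + filters + output".
# None of these names come from DAP itself.
require 'json'

# Input stage: turn each line of text into a document (a Hash).
def read_lines(text)
  text.each_line.map { |line| { 'line' => line.chomp } }
end

# Filters: each takes one document and returns zero or more documents.
drop_empty  = ->(doc) { doc['line'].empty? ? [] : [doc] }             # reduce
upcase_line = ->(doc) { [doc.merge('line' => doc['line'].upcase)] }   # transform
add_length  = ->(doc) { [doc.merge('length' => doc['line'].length)] } # expand

# Run every document through the filters, in order.
def run_pipeline(docs, filters)
  filters.reduce(docs) do |current, filter|
    current.flat_map { |doc| filter.call(doc) }
  end
end

# Output stage: one JSON document per line.
def write_json(docs)
  docs.map { |doc| JSON.generate(doc) }.join("\n")
end

docs = read_lines("alpha\n\nbeta\n")
output = write_json(run_pipeline(docs, [drop_empty, upcase_line, add_length]))
puts output
```

Modelling a filter as "one document in, a list of documents out" lets a single interface cover dropping, rewriting, and fanning out records.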

DAP was written to process terabyte-sized public scan datasets, such as those provided by https://scans.io/. Although DAP isn't particularly fast, it can be used across multiple cores (and machines) by splitting the input source and wrapping the execution with GNU Parallel.

Installation

Prerequisites

DAP requires Ruby 2.6.x or later and is best suited to systems with a relatively current version. Ideally, Ruby is managed with either rbenv or rvm, with the bundler gem installed and up to date. Using a system-managed Ruby is possible but fraught with peril.

Maxmind IP Location Databases

If you intend to use any of the geo_ip* or geo_ip2* filters, you must install the databases that provide the data for these filters. Otherwise, you can skip this step.

dap versions 1.4.x and later depend on Maxmind's geoip2/geolite2 databases to append geographic and related metadata to analyzed datasets. To use this functionality, put your copy of the relevant Maxmind databases in /var/lib/geoip2 or in the data directory of your dap installation, or set an environment variable that specifies the full path to the database in question:

  • ASN: GeoLite2-ASN.mmdb (environment override: GEOIP2_ASN_DATABASE_PATH)
  • City: GeoLite2-City.mmdb (environment override: GEOIP2_CITY_DATABASE_PATH)
  • ISP: GeoIP2-ISP.mmdb (environment override: GEOIP2_ISP_DATABASE_PATH)
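
To check which database file would be picked up on a given machine, a lookup along these lines may help. It is a sketch under stated assumptions: `find_geoip2_database` is a hypothetical helper, not part of DAP, and the order it uses (environment override first, then the listed directories) is an assumption rather than documented behavior. Only /var/lib/geoip2 is searched by default, because the location of the dap data directory depends on the installation.

```ruby
# Hypothetical helper, not part of DAP: locate a GeoIP2 database file.
# Assumed lookup order: environment override first, then each directory in dirs.
def find_geoip2_database(file_name, env_var, env: ENV,
                         dirs: ['/var/lib/geoip2'],
                         exists: File.method(:file?))
  override = env[env_var]
  return override if override && !override.empty? && exists.call(override)

  dirs.map { |dir| File.join(dir, file_name) }
      .find { |path| exists.call(path) }
end

city_db = find_geoip2_database('GeoLite2-City.mmdb', 'GEOIP2_CITY_DATABASE_PATH')
puts(city_db || 'GeoLite2-City.mmdb not found')
```

The `env:` and `exists:` keyword arguments exist only so the lookup can be exercised without touching the real environment or filesystem.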

NOTE: dap versions before 1.4.x depended on Maxmind's legacy geoip databases to append geographic metadata to analyzed datasets. Maxmind has since dropped support for these legacy databases. If you intend to keep using this deprecated functionality, put your copy of the relevant legacy databases in /var/lib/geoip or in the data directory of your dap installation, or set an environment variable that specifies the full path to the database in question:

  • ASN: GeoIPASNum.dat (environment override in 1.4.x+: GEOIP_ASN_DATABASE_PATH)
  • City: geoip_city.dat (environment override in 1.4.x+: GEOIP_CITY_DATABASE_PATH)
  • Org: geoip_org.dat (environment override in 1.4.x+: GEOIP_ORG_DATABASE_PATH)

Ubuntu 16.04+

sudo apt-get install zlib1g-dev ruby ruby-dev gcc make ruby-bundler
gem install dap

OS X

# Install the GeoIP C library required by DAP
brew update
brew install geoip

gem install dap

Usage

In its simplest form, DAP takes input, applies zero or more filters which modify the input, and then outputs the result. The input, filters and output are separated by plus signs (+). As seen from dap -h:

Usage: dap  [input] + [filter] + [output]
       --inputs
       --outputs
       --filters

To see which input/output formats are supported and what filters are available, run dap --inputs, dap --outputs, or dap --filters, respectively.
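
As a rough picture of how a plus-separated command line breaks down into stages, consider the following Ruby sketch. It is not DAP's actual argument parser: `split_stages` and `describe_stages` are hypothetical names, and ignoring empty stages is an assumption made for the sketch.

```ruby
# Illustrative sketch, not DAP's actual argument parser.
# Split an argument list on "+" into stages; each stage is a name plus its arguments.
def split_stages(argv)
  stages = [[]]
  argv.each do |arg|
    if arg == '+'
      stages << []          # a plus sign starts the next stage
    else
      stages.last << arg    # anything else belongs to the current stage
    end
  end
  stages.reject(&:empty?)   # assumption: empty stages are ignored
end

# Label the first stage as input, the last as output, and the rest as filters.
def describe_stages(argv)
  stages = split_stages(argv)
  {
    input: stages.first,
    filters: stages[1...-1] || [],
    output: stages.last
  }
end

p describe_stages(%w[lines + geo_ip2_city line + json])
```

Here the first stage names the input plugin, the last names the output plugin, and everything in between is a filter followed by its arguments.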

This example reads as input a single IP address from STDIN in line form, applies geo-ip transformations as a filter on that line, and then returns the output as JSON:

$   echo 8.8.8.8 | bin/dap + lines + geo_ip2_city line + json | jq .
{
  "line": "8.8.8.8",
  "line.geoip2.city.city.geoname_id": "0",
  "line.geoip2.city.continent.code": "NA",
  "line.geoip2.city.continent.geoname_id": "6255149",
  "line.geoip2.city.country.geoname_id": "6252001",
  "line.geoip2.city.country.iso_code": "US",
  "line.geoip2.city.country.is_in_european_union": "false",
  "line.geoip2.city.location.accuracy_radius": "1000",
  "line.geoip2.city.location.latitude": "37.751",
  "line.geoip2.city.location.longitude": "-97.822",
  "line.geoip2.city.location.metro_code": "0",
  "line.geoip2.city.location.time_zone": "America/Chicago",
  "line.geoip2.city.postal.code": "",
  "line.geoip2.city.registered_country.geoname_id": "6252001",
  "line.geoip2.city.registered_country.iso_code": "US",
  "line.geoip2.city.registered_country.is_in_european_union": "false",
  "line.geoip2.city.represented_country.geoname_id": "0",
  "line.geoip2.city.represented_country.iso_code": "",
  "line.geoip2.city.represented_country.is_in_european_union": "false",
  "line.geoip2.city.represented_country.type": "",
  "line.geoip2.city.traits.is_anonymous_proxy": "false",
  "line.geoip2.city.traits.is_satellite_provider": "false",
  "line.geoip2.city.continent.name": "North America",
  "line.geoip2.city.country.name": "United States",
  "line.geoip2.city.registered_country.name": "United States"
}
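
In the record above, the nested GeoIP2 fields are flattened into dotted keys, and every value, including coordinates, appears as a JSON string. A downstream script might select the keys under one prefix and convert values as needed. The sketch below uses a trimmed copy of that record; the variable names are illustrative only.

```ruby
require 'json'

# A trimmed copy of the record shown above.
raw = '{"line":"8.8.8.8",' \
      '"line.geoip2.city.country.iso_code":"US",' \
      '"line.geoip2.city.location.latitude":"37.751",' \
      '"line.geoip2.city.location.longitude":"-97.822"}'

doc = JSON.parse(raw)

# Keep only the keys under one prefix, then strip the prefix for readability.
prefix = 'line.geoip2.city.'
geo = doc.select { |key, _| key.start_with?(prefix) }
         .transform_keys { |key| key.delete_prefix(prefix) }

# Values arrive as strings, so convert the coordinates explicitly.
latitude  = Float(geo['location.latitude'])
longitude = Float(geo['location.longitude'])

puts "#{doc['line']} -> #{geo['country.iso_code']} (#{latitude}, #{longitude})"
```

Keeping the keys flat and stripping a prefix avoids a clash that a naive re-nesting would hit: `line` is both a plain value and the parent of `line.geoip2...`.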

There are also several examples of how to use DAP along with sample datasets here.
