Sotera Defense, now Jacobs (@Sotera)

Top repositories

1

spark-distributed-louvain-modularity

Spark/GraphX implementation of the distributed Louvain modularity algorithm
Scala
304
star
2

distributed-graph-analytics

Distributed Graph Analytics (DGA) is a compendium of graph analytics written for Bulk-Synchronous-Parallel (BSP) processing frameworks such as Giraph and GraphX. The analytics included are High Betweenness Set Extraction, Weakly Connected Components, PageRank, Leaf Compression, and Louvain Modularity.
Java
168
star
3

correlation-approximation

Spark implementation of the Google Correlate algorithm to quickly find highly correlated vectors in huge datasets
Scala
91
star
4

mitie-trainer

Model Training tool for MITIE
JavaScript
77
star
5

newman

Quickly analyze and explore email with advanced analytics and visualization.
JavaScript
55
star
6

pst-extraction

PST extraction and analytic pipeline
Python
37
star
7

distributed-louvain-modularity

Community Detection and Compression Analytic for Big Graph Data
Java
37
star
8

graphene

JavaScript
24
star
9

zephyr

Zephyr is a platform-agnostic big data ETL API with bindings for Hadoop MapReduce, Storm, and other big data frameworks.
Java
21
star
10

watchman

Watchman: An open-source social-media event-detection system
JavaScript
20
star
11

aggregate-micro-paths

Infer movement patterns from large amounts of geo-temporal data in a cloud environment.
Python
14
star
12

track-communities

A series of analytics for creating networks from geo-temporal track data based on time/space co-occurrence. Includes a UI for visualizing communities and tracks.
JavaScript
14
star
13

Datawake

Browser add-on and web server to support collection and analysis of web browsing data.
JavaScript
13
star
14

Datawake-Legacy

This project is superseded by the current Datawake project but is maintained here for existing users. Browser extension and backend services aimed at enhancing Internet search with domain-specific knowledge, collaboration, and analysis.
JavaScript
10
star
15

DatawakeDepot

LoopBack web application for administration of Datawake networks
JavaScript
9
star
16

high-betweenness-set-extraction

Approximate Betweenness Centrality computation for big graph data.
Java
8
star
17

rhipe-arima

An R/Hadoop ARIMA analytic that uses RHIPE to submit MapReduce jobs.
R
8
star
18

GEQE

Geo Event Query by Example: leverage geo-located temporal text data to identify similar locations or events.
Python
8
star
19

firmament

Node.js script and Docker files to create a MySQL/MongoDB-backed AngularJS/Bootstrap web application
JavaScript
7
star
20

datawake-prefetch

Python
7
star
21

page-rank

Java
6
star
22

social-sandbox

Geo-temporal scraping of social media, unsupervised event detection
JavaScript
4
star
23

xdata-vm

Vagrant-Ubuntu VM serving as a platform for XDATA performer software integration
Ruby
4
star
24

xdata-nba

Tools to mine NBA data
Python
3
star
25

leaf-compression

Java
3
star
26

DatawakeManager-WebApp

DatawakeManager Web Server
JavaScript
2
star
27

newman-vm

Newman VM
Shell
2
star
28

interactive-graph-viewer

An R Shiny app for interactively viewing the results of the Louvain method for community detection.
JavaScript
2
star
29

triangle-counting

A port of the work at Sandia National Laboratories on approximate triangle counting via wedge sampling.
Scala
2
star
30

merlin-stack

Shell
2
star
31

hive-common-udf

A collection of common Apache Hive UDFs
Java
2
star
32

graphene-enron

JavaScript
2
star
33

graphene-walker

Java
2
star
34

go_watchman

github.com/watchman apps for which Go is especially well suited
Go
2
star
35

Rmmtsne

A native R implementation of multiple maps t-distributed stochastic neighbor embedding (mmtsne).
R
1
star
36

twitter-cacher

Twitter Scraper
Java
1
star
37

zephyr-sample-project

A sample project (or, rather, sample projects) to show various ways of using Zephyr - generally a good starting point for your own Zephyr implementations.
Java
1
star
38

vande

Java
1
star
39

sotera.github.io

CSS
1
star
40

DatawakeManager-Loopback

DatawakeManager Data Layer
JavaScript
1
star
41

newman-research

Tools to be evaluated prior to integration into Newman
Python
1
star
42

graphene-instagram

A version of Graphene that runs on scraped Instagram data.
Java
1
star
43

DatawakeFFPlugin

JMI-based Datawake plugin for Firefox 38+
JavaScript
1
star
44

zephyr-contrib

Useful classes for functions outside the scope of Zephyr's ETL, but still used in many scenarios (generally with extensive dependencies that probably shouldn't be in the core API).
Java
1
star
45

DatawakeSuite

1
star
46

micropath-kml

For creating KML to visualize aggregate micro-path output.
Java
1
star