• This repository has been archived on 13/Aug/2024
  • Stars
    9
  • Rank 1,933,530 (Top 39 %)
  • Language
    Shell
  • Created over 9 years ago
  • Updated over 9 years ago

Repository Details

VM with complete R (RStudio) environment

More Repositories

1

factotum

A system to programmatically run data pipelines
Rust
220
star
2

schema-guru

JSONs -> JSON Schema
Scala
151
star
3

spark-example-project

A Spark WordCountJob example as a standalone SBT project with Specs2 tests, runnable on Amazon EMR
Scala
118
star
4

codeigniter-paypal-ipn

A CodeIgniter library for working with the PayPal IPN (Instant Payment Notification) service
PHP
111
star
5

spark-streaming-example-project

A Spark Streaming job reading events from Amazon Kinesis and writing event counts to DynamoDB
Scala
94
star
6

scalding-example-project

The Scalding WordCountJob example as a standalone SBT project with Specs2 tests, runnable on Amazon EMR
Scala
82
star
7

snowplow-docker

Docker images for Snowplow, Iglu and associated projects
Dockerfile
61
star
8

aws-lambda-scala-example-project

An AWS Lambda function in Scala reading events from Amazon Kinesis and writing event counts to DynamoDB
Scala
57
star
9

symfony2-paypal-ipn

A Symfony2 bundle for working with the PayPal IPN (Instant Payment Notification) service
PHP
56
star
10

sluice

A Ruby toolkit for cloud-friendly ETL
Ruby
38
star
11

google-cloud-dataflow-example-project

Example stream processing job, written in Scala with Apache Beam, for Google Cloud Dataflow
Scala
29
star
12

snowplow-tco-model

UNMAINTAINED. 2013
R
22
star
13

kinesis-example-scala-consumer

Example Scala/SBT event consumer for Amazon Kinesis
Scala
22
star
14

kinesis-example-scala-producer

Example Scala/SBT event producer for Amazon Kinesis
Scala
21
star
15

cloudfront-log-deserializer

A Hive Deserializer for CloudFront access logs (supports download distribution files only)
Java
17
star
16

snowplow.github.com

Legacy Snowplow website, switched off 25 April 2017
HTML
16
star
17

maxmind-geolite-update

A Python script to regularly update MaxMind's free geo databases
Python
15
star
18

icebucket

UNRELEASED. An opinionated framework for analytics-on-write on event streams using key-value storage
Scala
14
star
19

avalanche

Load testing for event analytics platforms (Snowplow, more coming soon)
Scala
13
star
20

dev-environment

Vagrant-based Snowplow development environment with Ansible playbooks to install common tools
Shell
12
star
21

factotum-server

Rust
10
star
22

prestashop-scala-client

Scala client for the PrestaShop Web Service (aka prestasac)
Scala
9
star
23

engineering-resources

7
star
24

huskimo

🐕 Extracts data from SaaS APIs and stores in Redshift
Scala
7
star
25

bigquery-loader-cli

UNMAINTAINED. Prototype CLI app for uploading Snowplow enriched events to BigQuery
Scala
5
star
26

snowplow-omniture-ingest

Ingests Omniture data (exported as log files) into SnowPlow for more involved analysis
5
star
27

infobright-ruby-loader

A data loader for Infobright, built in Ruby. Modelled on Infobright's own ParaFlex
Ruby
5
star
28

samza-scala-example-project

An Apache Samza stream processing job written in Scala
Scala
5
star
29

redash-java-sdk

Java
4
star
30

nsq-spark-example-project

A Spark job example for integrating NSQ with Spark
Scala
4
star
31

snowplow-gtm-custom-template

GTM Custom Template for the Snowplow JavaScript Tracker (v2)
Smarty
4
star
32

schema-ddl

MOVED. See:
Scala
4
star
33

dataform-data-models

Snowplow Incubator project for Dataform SQL data models for working with Snowplow data. Supports BigQuery only
JavaScript
4
star
34

looker-snowplow-web

A LookML block that uses data from the Snowplow JavaScript tracker and Web Data Model derived tables and makes it available for exploration in Looker.
LookML
4
star
35

iglu-ruby-client

Ruby and JRuby client for Iglu
Ruby
3
star
36

neo4j-data-science-environment

VM with Neo4j installed
Shell
3
star
37

scala-serf-client

Minimal wrapper around https://github.com/tv2norge/java-serf-client
Scala
3
star
38

sp-js-assets

Contains all of the Snowplow JavaScript Tracker assets.
JavaScript
3
star
39

python-data-science-environment

Shell
3
star
40

snowplow-scala-project.g8

Shell
3
star
41

hive-example-udf

Java
3
star
42

makefile-rs

WIP Rust crate for parsing extremely simple Makefiles
Rust
2
star
43

right-to-be-forgotten-spark-job

Spark job for right to be forgotten
Scala
2
star
44

spark-data-science-environment

VM with Spark ready-to-go
Shell
2
star
45

piinguin

A micro-service to securely store pseudonymized PII data
Scala
2
star
46

graph-event-data-model

Schemas for nodes, relationships and events
2
star
47

event-manifest-cleaner

A Spark job that takes records straight from the failed enriched good directory and deletes exactly those from DynamoDB
Scala
2
star
48

snowplow-piinguin-relay

Snowplow Relay for feeding PII transformation events from Snowplow into Piinguin
Scala
2
star
49

scalacheck-schema

ScalaCheck generators for various Iglu-compatible schema formats
Scala
2
star
50

narcolepsy-scala

A Scala framework for building typesafe clients for RESTful web services
Scala
2
star
51

snowplow-clickhouse-loader

Scala
1
star
52

bintray-usage-alerter

Alerts PagerDuty when malicious downloaders target your Bintray files
Crystal
1
star
53

blob2stream

Reads records from cloud blob storage and writes to cloud stream
Scala
1
star
54

blix-javascript

Blix is a JavaScript library for adding surveys, coupons and flash messages to websites
JavaScript
1
star
55

iglu-objc-client

Objective-C client for Iglu
Objective-C
1
star
56

snowplow-cdc-source

Scala
1
star
57

vendor-matrix

1
star
58

snowplow-azure-data-lake-analytics-extractor

1
star
59

indicative-data-model

A data model for transforming Snowplow Staged Events for Indicative
1
star
60

snowplow-browser-plugin-simple-template

A simple template for creating and publishing a Browser Plugin for the Snowplow JavaScript Trackers
JavaScript
1
star