Snowplow Analytics: Archive (@snowplow-archive)

Top repositories

1

factotum

A system to programmatically run data pipelines
Rust
220
star
2

schema-guru

JSONs -> JSON Schema
Scala
151
star
3

spark-example-project

A Spark WordCountJob example as a standalone SBT project with Specs2 tests, runnable on Amazon EMR
Scala
118
star
4

codeigniter-paypal-ipn

A CodeIgniter library for working with the PayPal IPN (Instant Payment Notification) service
PHP
111
star
5

spark-streaming-example-project

A Spark Streaming job reading events from Amazon Kinesis and writing event counts to DynamoDB
Scala
94
star
6

scalding-example-project

The Scalding WordCountJob example as a standalone SBT project with Specs2 tests, runnable on Amazon EMR
Scala
82
star
7

snowplow-docker

Docker images for Snowplow, Iglu and associated projects
Dockerfile
61
star
8

aws-lambda-scala-example-project

An AWS Lambda function in Scala reading events from Amazon Kinesis and writing event counts to DynamoDB
Scala
57
star
9

symfony2-paypal-ipn

A Symfony2 bundle for working with the PayPal IPN (Instant Payment Notification) service
PHP
56
star
10

sluice

A Ruby toolkit for cloud-friendly ETL
Ruby
38
star
11

google-cloud-dataflow-example-project

Example stream processing job, written in Scala with Apache Beam, for Google Cloud Dataflow
Scala
29
star
12

snowplow-tco-model

UNMAINTAINED. 2013
R
22
star
13

kinesis-example-scala-consumer

Example Scala/SBT event consumer for Amazon Kinesis
Scala
22
star
14

kinesis-example-scala-producer

Example Scala/SBT event producer for Amazon Kinesis
Scala
21
star
15

cloudfront-log-deserializer

A Hive Deserializer for CloudFront access logs (supports download distribution files only)
Java
17
star
16

snowplow.github.com

Legacy Snowplow website, switched off 25 April 2017
HTML
16
star
17

maxmind-geolite-update

A Python script to regularly update MaxMind's free geo databases
Python
15
star
18

icebucket

UNRELEASED. An opinionated framework for analytics-on-write on event streams using key-value storage
Scala
14
star
19

avalanche

Load testing for event analytics platforms (Snowplow, more coming soon)
Scala
13
star
20

dev-environment

Vagrant-based Snowplow development environment with Ansible playbooks to install common tools
Shell
12
star
21

factotum-server

Rust
10
star
22

r-data-science-environment

VM with complete R (RStudio) environment
Shell
9
star
23

prestashop-scala-client

Scala client for the PrestaShop Web Service (aka prestasac)
Scala
9
star
24

engineering-resources

7
star
25

huskimo

🐕 Extracts data from SaaS APIs and stores in Redshift
Scala
7
star
26

bigquery-loader-cli

UNMAINTAINED. Prototype CLI app for uploading Snowplow enriched events to BigQuery
Scala
5
star
27

snowplow-omniture-ingest

Ingests Omniture data (exported as log files) into SnowPlow for more involved analysis
5
star
28

infobright-ruby-loader

A data loader for Infobright, built in Ruby. Modelled on Infobright's own ParaFlex
Ruby
5
star
29

samza-scala-example-project

An Apache Samza stream processing job written in Scala
Scala
5
star
30

redash-java-sdk

Java
4
star
31

nsq-spark-example-project

A Spark job example for integrating NSQ with Spark
Scala
4
star
32

snowplow-gtm-custom-template

GTM Custom Template for the Snowplow JavaScript Tracker (v2)
Smarty
4
star
33

schema-ddl

MOVED. See:
Scala
4
star
34

dataform-data-models

Snowplow Incubator project for Dataform SQL data models for working with Snowplow data. Supports BigQuery only
JavaScript
4
star
35

looker-snowplow-web

A LookML block that uses data from the Snowplow JavaScript tracker and Web Data Model derived tables and makes it available for exploration in Looker.
LookML
4
star
36

iglu-ruby-client

Ruby and JRuby client for Iglu
Ruby
3
star
37

neo4j-data-science-environment

VM with Neo4j installed
Shell
3
star
38

scala-serf-client

Minimal wrapper around https://github.com/tv2norge/java-serf-client
Scala
3
star
39

sp-js-assets

Contains all of the Snowplow JavaScript Tracker assets.
JavaScript
3
star
40

python-data-science-environment

Shell
3
star
41

snowplow-scala-project.g8

Shell
3
star
42

hive-example-udf

Java
3
star
43

makefile-rs

WIP Rust crate for parsing extremely simple Makefiles
Rust
2
star
44

right-to-be-forgotten-spark-job

Spark job for right to be forgotten
Scala
2
star
45

spark-data-science-environment

VM with Spark ready-to-go
Shell
2
star
46

piinguin

A micro-service to securely store pseudonymized PII data
Scala
2
star
47

graph-event-data-model

Schemas for nodes, relationships and events
2
star
48

event-manifest-cleaner

A Spark job that takes records straight from the failed enriched good directory and deletes exactly those from DynamoDB
Scala
2
star
49

snowplow-piinguin-relay

Snowplow Relay for feeding PII transformation events from Snowplow into Piinguin
Scala
2
star
50

scalacheck-schema

ScalaCheck generators for various Iglu-compatible schema formats
Scala
2
star
51

narcolepsy-scala

A Scala framework for building typesafe clients for RESTful web services
Scala
2
star
52

snowplow-clickhouse-loader

Scala
1
star
53

bintray-usage-alerter

Alerts PagerDuty when malicious downloaders target your Bintray files
Crystal
1
star
54

blob2stream

Reads records from cloud blob storage and writes to cloud stream
Scala
1
star
55

blix-javascript

Blix is a JavaScript library for adding surveys, coupons and flash messages to websites
JavaScript
1
star
56

iglu-objc-client

Objective-C client for Iglu
Objective-C
1
star
57

snowplow-cdc-source

Scala
1
star
58

vendor-matrix

1
star
59

snowplow-azure-data-lake-analytics-extractor

1
star
60

indicative-data-model

A data model for transforming Snowplow Staged Events for Indicative
1
star
61

snowplow-browser-plugin-simple-template

A simple template for creating and publishing a Browser Plugin for the Snowplow JavaScript Trackers
JavaScript
1
star