Snowplow Mini

An easily deployable, single-instance version of Snowplow that serves three use cases:

  1. Gives a Snowplow consumer (e.g. an analyst, data team or marketing team) a way to quickly understand what Snowplow "does", i.e. what you put in at one end and what you take out at the other
  2. Gives developers new to Snowplow an easy way to start with Snowplow and understand how the different pieces fit together
  3. Gives people running Snowplow a quick way to debug tracker updates (because they can)

Features

  • Data is tracked and processed in real time
  • An Iglu Server is included so that custom schemas can be uploaded
  • Data is validated during processing
    • This is done using both our standard Iglu schemas and any custom ones that you have loaded into the Iglu Server
  • Data is loaded into OpenSearch
    • Can be queried directly or through an OpenSearch dashboard
    • Good and bad events are in distinct indexes
  • A UI to indicate what is happening with each of the different subsystems (collector, enrich, etc.), giving developers an in-depth way of understanding how the different Snowplow subsystems work with one another
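
To make "custom schemas" more concrete, here is a minimal sketch of a self-describing JSON Schema of the kind an Iglu registry stores. The vendor (com.example), the schema name (button_click) and the single button_id property are hypothetical values chosen for illustration; how the file is uploaded to the Iglu Server is covered by the usage guide and is not shown here.

```shell
# Sketch: print a minimal self-describing JSON Schema.
# Assumption: vendor, name and properties below are made up for illustration.
example_schema() {
  cat <<'EOF'
{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "description": "Hypothetical schema for a button click event",
  "self": {
    "vendor": "com.example",
    "name": "button_click",
    "format": "jsonschema",
    "version": "1-0-0"
  },
  "type": "object",
  "properties": {
    "button_id": { "type": "string", "maxLength": 64 }
  },
  "required": ["button_id"],
  "additionalProperties": false
}
EOF
}
```

Running `example_schema > button_click.json` writes the schema to a file. The "self" block carries the vendor, name, format and version that identify the schema when events are validated against it.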

Note: Until version 0.15.0, Snowplow Mini loaded data into Elasticsearch 6.x. However, a licensing change in Elasticsearch prevented us from upgrading to more recent versions. To stay up to date with important security fixes, we decided to replace Elasticsearch with OpenSearch, and Kibana with OpenSearch Dashboards. You may still encounter the terms elasticsearch and kibana in the project.

Documentation

Cloud setup guides for AWS and GCP, in addition to a usage guide, are available at our docs website.

Local Quick Start

To run Snowplow Mini on your local machine, you will need to install the following prerequisites:

  • Vagrant
  • VirtualBox

Then you should be able to stand up Snowplow Mini locally by running:

$ git clone https://github.com/snowplow/snowplow-mini.git
  Cloning into 'snowplow-mini'...
$ cd snowplow-mini
$ vagrant up
  Bringing machine 'default' up with 'virtualbox' provider...

This will take a little time to complete, so grab yourself a ☕ and come back in a few minutes. See the troubleshooting section below if you encounter any errors.

Once complete, a Snowplow Collector will be running on http://localhost:8080 and the Snowplow Mini UI will be on http://localhost:2000/home.
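
As a quick smoke test once the machine is up, a single page-view event can be sent to the Collector with curl. The sketch below assumes the Collector accepts GET requests on the tracker-protocol pixel path /i. The function names, the tracker version label (readme-example-0.1.0) and the page URL are made up for illustration.

```shell
# Sketch: send one page-view event to a local Snowplow Mini collector.
# Assumption: the collector serves the tracker-protocol pixel endpoint at /i.
COLLECTOR="${COLLECTOR:-http://localhost:8080}"

# pixel_url PAGE_URL -> prints the GET URL for a page-view (e=pv) event.
# tv=readme-example-0.1.0 is a hypothetical tracker version label.
pixel_url() {
  printf '%s/i?e=pv&p=web&tv=readme-example-0.1.0&url=%s\n' "$COLLECTOR" "$1"
}

# send_page_view PAGE_URL -> sends the event and prints the HTTP status code.
send_page_view() {
  curl --silent --output /dev/null --write-out '%{http_code}\n' "$(pixel_url "$1")"
}
```

Calling `send_page_view http://example.com/` should print the HTTP status code returned by the Collector. If the event passes validation, it should then show up in the good index.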

To log in to the Snowplow Mini UI for the first time, follow the First time usage section within the documentation for the version of Snowplow Mini you have just created.

Once you are finished with Snowplow Mini locally, it is wise to stop the virtual machine:

$ vagrant halt
  ==> default: Attempting graceful shutdown of VM...

If you wish to tidy up all the resources, including deleting the virtual machine:

$ vagrant destroy
  default: Are you sure you want to destroy the 'default' VM? [y/N] y
  ==> default: Destroying VM and associated drives...

Vagrant Troubleshooting

Some advice on how to handle certain errors if you're trying to build this locally with Vagrant.

The box 'ubuntu/bionic64' could not be found or could not be accessed in the remote catalog.

Your Vagrant version is probably outdated. Use Vagrant 2.0.0+.

npm install results in enoent ENOENT: no such file or directory, open '/package.json'

This is caused by trying to use NFS. Comment out the relevant lines in the Vagrantfile.

Most likely this will happen on TASK [sp_mini_5_build_ui : Install npm packages based on package.json.]. See also: https://discourse.snowplowanalytics.com/t/snowplow-mini-local-vagrant/2930.

Topology

Snowplow Mini runs several distinct applications on the same box, all linked by NSQ topics. In a production deployment, each application could run as its own Auto Scaling group and each NSQ topic would be a distinct Kinesis stream.

  • Stream Collector:
    • Starts a server listening on http://<sp mini public ip>/, to which events can be sent
    • Sends "good" events to the RawEvents NSQ topic
    • Sends "bad" events to the BadEvents NSQ topic
  • Stream Enrich:
    • Reads events in from the RawEvents NSQ topic
    • Sends events which passed the enrichment process to the EnrichedEvents NSQ topic
    • Sends events which failed the enrichment process to the BadEvents NSQ topic
  • Elasticsearch Sink Good:
    • Reads events from the EnrichedEvents NSQ topic
    • Sends those events to the good Elasticsearch index
    • On failure to insert, writes errors to BadElasticsearchEvents NSQ topic
  • Elasticsearch Sink Bad:
    • Reads events from the BadEvents NSQ topic
    • Sends those events to the bad Elasticsearch index
    • On failure to insert, writes errors to BadElasticsearchEvents NSQ topic

These events can then be viewed in Kibana at http://<sp mini public ip>/kibana.
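
The routing described in the list above can be restated as a small lookup, which may help when working out which topic to inspect for a missing event. This is only an illustrative sketch: the function name and the short component labels (collector, enrich, sink-good, sink-bad) are made up here, while the topic names are the ones listed above.

```shell
# Sketch: which NSQ topic does a component write to for a given outcome?
# topic_for COMPONENT OUTCOME -> prints the topic name, or fails for unknown routes.
topic_for() {
  case "$1:$2" in
    collector:good)   echo RawEvents ;;               # accepted by the collector
    collector:bad)    echo BadEvents ;;               # rejected by the collector
    enrich:good)      echo EnrichedEvents ;;          # passed enrichment
    enrich:bad)       echo BadEvents ;;               # failed enrichment
    sink-good:failed) echo BadElasticsearchEvents ;;  # insert into good index failed
    sink-bad:failed)  echo BadElasticsearchEvents ;;  # insert into bad index failed
    *) echo "unknown route: $1:$2" >&2; return 1 ;;
  esac
}
```

For example, `topic_for enrich bad` prints BadEvents, the topic that the Elasticsearch Sink Bad then reads from.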

Copyright and license

Snowplow Mini is copyright 2016-2022 Snowplow Analytics Ltd.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this software except in compliance with the License.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
