
Powerful analytics and cohort library using Redis bitmaps

bitmapist

NEW! Try out our new standalone bitmapist-server, which improves memory efficiency 443 times and makes your setup much cheaper to run (and more scalable). It's fully compatible with bitmapist that runs on Redis.

bitmapist: a powerful analytics library for Redis

This Python library makes it possible to implement real-time, highly scalable analytics that can answer the following questions:

  • Has user 123 been online today? This week? This month?
  • Has user 123 performed action "X"?
  • How many users have been active this month? This hour?
  • How many unique users have performed action "X" this week?
  • What percentage of users that were active last week are still active?
  • What percentage of users that were active last month are still active this month?
  • What users performed action "X"?

This library is very easy to use and enables you to create your own reports easily.

Using Redis bitmaps you can store events for millions of users in a very small amount of memory (megabytes). Be careful with huge ids, because a bitmap grows with the largest id it holds and large ids can require much more memory. Ids should be in the range [0, 2^32).
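To make the memory note concrete, here is a minimal in-memory sketch of how a bitmap's size follows the largest id stored in it. The class `TinyBitmap` and the helper `bitmap_bytes_for_max_id` are hypothetical names used only for illustration; they are not part of bitmapist and do not talk to Redis.

```python
class TinyBitmap:
    """In-memory stand-in for a Redis bitmap (illustration only)."""

    def __init__(self):
        self.data = bytearray()

    def set_bit(self, offset):
        byte_index = offset // 8
        # Grow the buffer so that the highest offset fits, as a bitmap must.
        if byte_index >= len(self.data):
            self.data.extend(b"\x00" * (byte_index + 1 - len(self.data)))
        # Bits are numbered from the most significant bit of each byte.
        self.data[byte_index] |= 0x80 >> (offset % 8)

    def get_bit(self, offset):
        byte_index = offset // 8
        if byte_index >= len(self.data):
            return False
        return bool(self.data[byte_index] & (0x80 >> (offset % 8)))

    def count(self):
        # Number of set bits, i.e. number of marked ids.
        return sum(bin(byte).count("1") for byte in self.data)


def bitmap_bytes_for_max_id(max_id):
    """Bytes needed for a bitmap whose largest marked id is max_id."""
    return max_id // 8 + 1


active = TinyBitmap()
for uid in (1, 123, 1_000_000):
    active.set_bit(uid)

print(active.count())                        # 3
print(len(active.data))                      # 125001
print(bitmap_bytes_for_max_id(2**32 - 1))    # 536870912
```

Three marked users still cost about 125 KB here because one of the ids is 1,000,000, and an id near 2^32 would need 512 MiB for a single bitmap. Dense, small ids keep the bitmaps compact.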

Additionally, bitmapist can generate cohort graphs that show the following:

  • Cohort over user retention
  • What percentage of users that were active in the last [days, weeks, months] are still active?
  • What percentage of users that performed action X also performed action Y (and how this changes over time)?
  • And a lot of other things!
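The retention questions above all reduce to the same calculation: the size of an intersection divided by the size of the starting group. A minimal sketch with plain Python sets standing in for bitmaps (the function name `retention_percent` and the sample ids are assumptions for illustration, not bitmapist API):

```python
def retention_percent(earlier, later):
    """Percentage of users in `earlier` who also appear in `later`."""
    if not earlier:
        return 0.0
    return 100.0 * len(earlier & later) / len(earlier)


active_last_week = {1, 2, 3, 4}
active_this_week = {2, 4, 5}

print(retention_percent(active_last_week, active_this_week))  # 50.0
```

With bitmaps the intersection is a bitwise AND and the sizes are bit counts, which is why these reports stay cheap even for large user bases.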

If you want to read more about bitmaps, please read the following:

Installation

bitmapist can be installed via pip:

$ pip install bitmapist

Ports

Examples

Setting things up:

from datetime import datetime, timedelta
from bitmapist import setup_redis, delete_all_events, mark_event,\
                      MonthEvents, WeekEvents, DayEvents, HourEvents,\
                      BitOpAnd, BitOpOr

now = datetime.utcnow()
last_month = datetime.utcnow() - timedelta(days=30)

Mark user 123 as active and as having played a song:

mark_event('active', 123)
mark_event('song:played', 123)

Answer if user 123 has been active this month:

assert 123 in MonthEvents('active', now.year, now.month)
assert 123 in MonthEvents('song:played', now.year, now.month)
assert MonthEvents('active', now.year, now.month).has_events_marked() == True

How many users have been active this week?

print(len(WeekEvents('active', now.year, now.isocalendar()[1])))
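The week number above comes from `now.isocalendar()[1]`. Near a year boundary the ISO year can differ from the calendar year, so pairing `now.year` with an ISO week number may point at a different week than intended. A minimal standard-library sketch of the difference (the helper `iso_year_week` is a hypothetical name, not part of bitmapist, and how bitmapist resolves such dates internally is not covered here):

```python
from datetime import date


def iso_year_week(day):
    """Return the (ISO year, ISO week) pair for a date."""
    iso_year, iso_week, _weekday = day.isocalendar()
    return iso_year, iso_week


boundary = date(2024, 12, 30)
print(boundary.year)             # 2024
print(iso_year_week(boundary))   # (2025, 1)

midyear = date(2024, 6, 15)
print(iso_year_week(midyear))    # (2024, 24)
```

If you build week arguments by hand for dates around New Year, taking both values from `isocalendar()` keeps the year and week consistent.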

Iterate over all users active this week:

for uid in WeekEvents('active'):
    print(uid)

If you're interested in "current events", you can omit the extra date arguments (now.year, now.month and so on). Events will be populated with the current time automatically.

For example, these two calls are equivalent:

MonthEvents('active') == MonthEvents('active', now.year, now.month)

Additionally, for the sake of uniformity, you can create an event from any datetime object with a from_date static method.

MonthEvents.from_date('active', now) == MonthEvents('active', now.year, now.month)

Get the list of these users (user ids):

print(list(WeekEvents('active', now.year, now.isocalendar()[1])))

The special methods prev and next return "sibling" events and let you walk through events in time without any sophisticated iterators. The delta method lets you "jump" forward or backward by more than one step. The uniform API allows you to use all types of base events (from hour to year) with the same code.

current_month = MonthEvents('active')
prev_month = current_month.prev()
next_month = current_month.next()
year_ago = current_month.delta(-12)
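To illustrate what stepping between sibling months involves, here is a small sketch of the underlying calendar arithmetic. The function `shift_month` is a hypothetical helper for illustration only and is not how bitmapist is necessarily implemented.

```python
def shift_month(year, month, delta):
    """Move (year, month) forward or backward by `delta` months."""
    # Count months from year 0 so that year rollovers fall out of the division.
    index = year * 12 + (month - 1) + delta
    return index // 12, index % 12 + 1


print(shift_month(2024, 1, -1))    # (2023, 12)
print(shift_month(2024, 12, 1))    # (2025, 1)
print(shift_month(2024, 3, -12))   # (2023, 3)
```

The same idea applies to the other period types, with days, weeks, or hours as the unit instead of months.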

Every event object has period_start and period_end methods to find a time span of the event. This can be useful for caching values when the caching of "events in future" is not desirable:

ev = MonthEvents('active', dt.year, dt.month)  # dt is any datetime inside the month of interest
if ev.period_end() < now:
    cache.set('active_users_<...>', len(ev))
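The caching rule above depends on knowing when a period is over. A standard-library sketch of month boundaries and the "is this period closed?" check follows. The helpers `month_period` and `is_closed` are hypothetical names, and the exact boundary values bitmapist returns from period_start and period_end may differ from this sketch.

```python
from datetime import datetime, timedelta


def month_period(year, month):
    """Return (start, end) datetimes covering one calendar month."""
    start = datetime(year, month, 1)
    if month == 12:
        next_start = datetime(year + 1, 1, 1)
    else:
        next_start = datetime(year, month + 1, 1)
    # The end is the last representable moment before the next month starts.
    end = next_start - timedelta(microseconds=1)
    return start, end


def is_closed(year, month, now):
    """True when the month has fully ended, so its counts can no longer change."""
    _start, end = month_period(year, month)
    return end < now


print(month_period(2024, 2)[1])                  # 2024-02-29 23:59:59.999999
print(is_closed(2024, 2, datetime(2024, 3, 1)))  # True
print(is_closed(2024, 2, datetime(2024, 2, 15))) # False
```

Only values for closed periods are safe to cache indefinitely; counts for the current period can still grow.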

Hourly tracking is disabled by default to save memory. To enable it by default:

import bitmapist
bitmapist.TRACK_HOURLY = True

Additionally, you can pass an extra argument to mark_event to override the default value:

mark_event('active', 123, track_hourly=False)

Unique events

Sometimes the date of the event makes little or no sense, for example, to filter out your premium accounts, or in A/B testing. There is a UniqueEvents model for this purpose. The model creates only one Redis key and doesn't depend on the date.

You can combine unique events with other types of events.

A/B testing example:

from bitmapist import DayEvents, UniqueEvents

active_today = DayEvents('active')
a = UniqueEvents('signup_form:classic')
b = UniqueEvents('signup_form:new')

print("Active users, signed up with classic form", len(active_today & a))
print("Active users, signed up with new form", len(active_today & b))

Generic filter example

def premium_up(uid):
    # called when user promoted to premium
    ...
    mark_unique('premium', uid)


def premium_down(uid):
    # called when user loses the premium status
    ...
    unmark_unique('premium', uid)

active_today = DayEvents('active')
premium = UniqueEvents('premium')

# Add extra Karma for all premium users active today,
# just because today is a special day
for uid in premium & active_today:
    add_extra_karma(uid)

To get the best of both worlds, you can mark a unique event and regular bitmapist events at the same time.

def premium_up(uid):
    # called when user promoted to premium
    ...
    mark_event('premium', uid, track_unique=True)

Perform bit operations

How many users who were active last month are still active this month?

active_2_months = BitOpAnd(
    MonthEvents('active', last_month.year, last_month.month),
    MonthEvents('active', now.year, now.month)
)
print(len(active_2_months))

# Is 123 active for 2 months?
assert 123 in active_2_months

Alternatively, you can use standard Python syntax for bitwise operations.

last_month_event = MonthEvents('active', last_month.year, last_month.month)
this_month_event = MonthEvents('active', now.year, now.month)
active_two_months = last_month_event & this_month_event

The operators &, |, ^ and ~ are supported.
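For intuition about what these operators compute, here is a sketch that uses plain Python integers as bitmaps, where bit N is set when user N is marked. The helpers `to_bitmap` and `to_ids` are hypothetical names for illustration and not part of bitmapist.

```python
def to_bitmap(user_ids):
    """Build an integer bitmap with one bit set per user id."""
    bitmap = 0
    for uid in user_ids:
        bitmap |= 1 << uid
    return bitmap


def to_ids(bitmap):
    """List the user ids whose bits are set."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


last_month = to_bitmap([1, 2, 3])
this_month = to_bitmap([2, 3, 4])

print(to_ids(last_month & this_month))  # [2, 3]        active in both months
print(to_ids(last_month | this_month))  # [1, 2, 3, 4]  active in either month
print(to_ids(last_month ^ this_month))  # [1, 4]        active in exactly one month

# NOT needs a universe to be meaningful: here, ids 0 through 5.
universe = to_bitmap(range(6))
print(to_ids(~this_month & universe))   # [0, 1, 5]     not active this month
```

Note that a NOT result includes every id that was not marked, including ids that never belonged to a real user, so it is usually combined with another bitmap (for example, all signed-up users) before counting.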

Work with nested bit operations (imagine what you can do with this ;-))!

active_2_months = BitOpAnd(
    BitOpAnd(
        MonthEvents('active', last_month.year, last_month.month),
        MonthEvents('active', now.year, now.month)
    ),
    MonthEvents('active', now.year, now.month)
)
print(len(active_2_months))
assert 123 in active_2_months

# Delete the temporary AND operation
active_2_months.delete()

Deleting

If you want to permanently remove marked events for any time period you can use the delete() method:

last_month_event = MonthEvents('active', last_month.year, last_month.month)
last_month_event.delete()

If you want to remove all bitmapist events use:

bitmapist.delete_all_events()

When using bit operations (e.g. BitOpAnd) you can (and probably should) delete the results unless you want them cached. There are different ways to go about this:

active_2_months = BitOpAnd(
    MonthEvents('active', last_month.year, last_month.month),
    MonthEvents('active', now.year, now.month)
)
# Delete the temporary AND operation
active_2_months.delete()

# delete all bit operations created in runtime up to this point
bitmapist.delete_runtime_bitop_keys()

# delete all bit operations (slow if you have many millions of keys in Redis)
bitmapist.delete_temporary_bitop_keys()

bitmapist cohort

With bitmapist cohort you can get a form and a table rendering of the data you keep in bitmapist. If this sounds confusing please look at Mixpanel.

Here's a simple example of how to generate a form and a rendering of the data you have inside bitmapist:

from bitmapist import cohort

html_form = cohort.render_html_form(
    action_url='/_Cohort',
    selections1=[ ('Are Active', 'user:active'), ],
    selections2=[ ('Task completed', 'task:complete'), ]
)
print(html_form)

dates_data = cohort.get_dates_data(select1='user:active',
                                   select2='task:complete',
                                   time_group='days')

html_data = cohort.render_html_data(dates_data,
                                    time_group='days')

print(html_data)

# All the arguments should come from the FORM element (html_form)
# but to make things more clear I have filled them in directly

This will render something similar to this:

bitmapist cohort screenshot
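As a rough picture of what such a cohort table contains, the following sketch computes rows from plain dictionaries of per-day user sets. Everything here (the function `cohort_rows`, its parameters, and the sample data) is a hypothetical illustration; it does not reproduce the output format of cohort.get_dates_data.

```python
from datetime import date, timedelta


def cohort_rows(first_event, second_event, start, num_rows, num_cols):
    """Build (day, cohort size, percentages) rows for a day-based cohort table.

    Each percentage is the share of users who did the first event on `day`
    and the second event `c` days later.
    """
    rows = []
    for r in range(num_rows):
        day = start + timedelta(days=r)
        cohort = first_event.get(day, set())
        percentages = []
        for c in range(num_cols):
            later = second_event.get(day + timedelta(days=c), set())
            if cohort:
                share = 100.0 * len(cohort & later) / len(cohort)
                percentages.append(round(share, 1))
            else:
                percentages.append(0.0)
        rows.append((day, len(cohort), percentages))
    return rows


active = {
    date(2024, 1, 1): {1, 2, 3, 4},
    date(2024, 1, 2): {2, 3},
}
completed = {
    date(2024, 1, 1): {1, 2},
    date(2024, 1, 2): {2, 3, 4},
    date(2024, 1, 3): {3},
}

for row in cohort_rows(active, completed, date(2024, 1, 1), 2, 2):
    print(row)
# (datetime.date(2024, 1, 1), 4, [50.0, 75.0])
# (datetime.date(2024, 1, 2), 2, [100.0, 50.0])
```

Each row starts from the users who performed the first action on a given day, and each column shows how many of them performed the second action after a growing delay.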

Contributing

Please see our guide here

Local Development

We use Poetry for dependency management & packaging. Please see here for setup instructions.

Once you have Poetry installed, you can run the following to install the dependencies in a virtual environment:

poetry install

Testing

To run the tests, you need a local Redis server installed.

We use pytest to run unit tests, which you can run through Poetry with:

poetry run pytest

Releasing new versions

  • Bump version in pyproject.toml
  • Update the CHANGELOG
  • Commit the changes with a commit message "Version X.X.X"
  • Tag the current commit with vX.X.X
  • Create a new release on GitHub named vX.X.X
  • GitHub Actions will publish the new version to PyPI for you

Legal

Copyright: 2012 by Doist Ltd.

License: BSD
