• Stars: 2
• Language: Elixir
• License: MIT License
• Created about 8 years ago
• Updated about 8 years ago

Repository Details

stream count distinct element estimation

Spacesaving

A simple algorithm for estimating the counts of the most frequent elements in an unbounded stream using bounded space. Each estimate is an upper bound on the element's actual count.

Docs on hex

Usage

Add it to your mix.exs deps

{:spacesaving, "~> 0.0.2"}

Initialize with 3 spaces, so that at most 3 elements are tracked at a time

import Spacesaving

state = init(3)

Push some elements

state = state
|> push(:foo) |> push(:foo) |> push(:foo) |> push(:foo)
|> push(:bar) |> push(:bar) |> push(:bar)
|> push(:baz) |> push(:baz)
|> push(:buzz)

Get the top k elements

top(state, 2) # This will be [foo: 4, bar: 3]
top(state, 3) # This will be [foo: 4, bar: 3, buzz: 3]
# The inaccuracy comes into play once an element is kicked out: :buzz replaced :baz,
# so its estimate of 3 is only an upper bound on its actual count of 1.
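
The eviction behavior in the example above can be sketched as follows. This is a minimal illustration of the general Space-Saving update rule, not the library's actual implementation: the module name `SpaceSavingSketch` and its functions `new/1`, `add/2`, and `estimates/1` are hypothetical, and the state is assumed to be a plain map of counters.

```elixir
defmodule SpaceSavingSketch do
  # Hypothetical illustration module; not part of the spacesaving package.
  # State is {capacity, %{element => estimated_count}}.

  def new(capacity) when capacity > 0, do: {capacity, %{}}

  def add({capacity, counts}, element) do
    cond do
      # Already tracked: increment its counter.
      Map.has_key?(counts, element) ->
        {capacity, Map.update!(counts, element, &(&1 + 1))}

      # A free slot remains: start tracking with a count of 1.
      map_size(counts) < capacity ->
        {capacity, Map.put(counts, element, 1)}

      # All slots are used: evict an entry with the smallest count.
      # The newcomer takes over that count plus one, which is why
      # estimates are upper bounds rather than exact counts.
      true ->
        {victim, min} = Enum.min_by(counts, fn {_el, count} -> count end)
        {capacity, counts |> Map.delete(victim) |> Map.put(element, min + 1)}
    end
  end

  # Entries sorted by estimated count, largest first.
  def estimates({_capacity, counts}) do
    Enum.sort_by(counts, fn {_el, count} -> -count end)
  end
end

# Same stream as in the usage example, with 3 slots.
sketch =
  [:foo, :foo, :foo, :foo, :bar, :bar, :bar, :baz, :baz, :buzz]
  |> Enum.reduce(SpaceSavingSketch.new(3), fn el, acc ->
    SpaceSavingSketch.add(acc, el)
  end)

# :foo is estimated at 4, while :bar and :buzz are both estimated at 3
# (the relative order of the two tied entries is not guaranteed).
IO.inspect(SpaceSavingSketch.estimates(sketch))
```

Scanning the whole map for the minimum keeps the sketch short. An implementation that cares about throughput would likely keep the counters in a structure that makes the minimum cheap to find.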

Merge two states

left  = init(4) |> push(:foo) |> push(:bar)
right = init(4) |> push(:foo) |> push(:baz)

merge(left, right)
|> top(3) # Would be [foo: 2, bar: 1, baz: 1]
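
One plausible way to think about merging is to add up the counters of both sides and keep only as many entries as there are slots. The sketch below assumes exactly that; `MergeSketch.merge_counts/3` is a hypothetical helper operating on plain maps, and the library's `merge/2` may work differently (for example, it may adjust for elements that are absent from one side).

```elixir
defmodule MergeSketch do
  # Hypothetical helper; not part of the spacesaving package.
  # Sums per-element counts, then keeps the `capacity` largest entries.
  def merge_counts(left, right, capacity) do
    left
    |> Map.merge(right, fn _el, a, b -> a + b end)
    |> Enum.sort_by(fn {_el, count} -> -count end)
    |> Enum.take(capacity)
    |> Map.new()
  end
end

# Counter maps corresponding to the two states in the merge example.
left = %{foo: 1, bar: 1}
right = %{foo: 1, baz: 1}

merged = MergeSketch.merge_counts(left, right, 4)
# merged holds foo => 2, bar => 1, baz => 1
IO.inspect(merged)
```

Under this assumption the result agrees with the merge example: `:foo` appears on both sides and ends up with 2, while `:bar` and `:baz` keep their count of 1.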

References

Original Paper
