• Stars
    star
    416
  • Rank 104,068 (Top 3 %)
  • Language
    Go
  • License
    Other
  • Created almost 11 years ago
  • Updated about 2 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Go datastructures.

Go Data Structures

by Tim Henderson ([email protected])

Copyright 2013, Licensed under the BSD 3-clause license. Please reach out to me directly if you require another licensing option. I am willing to work with you.

Purpose

To collect many important data structures for usage in go programs. Golang's standard library lacks many useful and important structures. This library attempts to fill the gap. I have implemented data-structure's as I have needed them. If there is a missing structure or even just a missing (or incorrect) method open an issue, send a pull request, or send an email patch.

The library also provides generic types to allow the user to swap out various data structures transparently. The interfaces provide operation for adding, removing, retrieving objects from collections as well as iterating over the collection using functional iterators.

The tree sub-package provides a variety of generic tree traversals. The tree traversals and other iterators in the package use a functional iteration technique detailed on my blog.

I hope you find my library useful. If you are using it drop me a line I would love to hear about it.

Current Collection

GoDoc

Lists

Doubly Linked List linked.LinkedList

A simple an extensible doubly linked list. It is Equatable Sortable, and Hashable as are the Nodes.

Array List list.List

Similar to a Java ArrayList or a Python or Ruby "list". There is a version (called Sortable) which integrates with the "sort" package from the standard library.

Sorted Array List list.Sorted

Keeps the ArrayList in sorted order for you.

Sorted Set set.SortedSet

Built on top of *list.Sorted, it provides basic set operations. With set.SortedSet you don't have to write code re-implementing sets with the map[type] datatype. Supports: intersection, union, set difference and overlap tests.

Map Set set.MapSet

Construct a types.Map from any types.Set.

Set Map set.SetMap

Construct a set from any types.Map.

Unique Deque linked.UniqueDeque

A double ended queue that only allows unique items inside. Constructed from a doubly linked list and a linear hash table.

Fixed Size Lists

Both list.List and list.Sorted have alternative constructors which make them fixed size. This prevents them from growing beyond a certain size bound and is useful for implementing other data structures on top of them.

Serialization

list.List, list.Sorted, and set.SortedSet all can be serialized provided their contents can be serialized. They are therefore suitable for being sent over the wire. See this example for how to use the serialization.

Heaps and Priority Queues

Binary Heap heap/Heap

This is a binary heap for usage as a priority queue. The priorities are given to items in the queue on insertion and cannot be changed after insertion. It can be used as both a min heap and a max heap.

Unique Priority Queue heap/UniquePQ

A priority queue which only allows unique entries.

Trees

AVL Tree tree/avl.AvlTree

An AVL Tree is a height balanced binary search tree. Insertion and retrieval are both O(log(n)) where n is the number items in the tree.

Immutable AVL Tree tree/avl.ImmutableAvlTree

This version of the classic is immutable and should be thread safe due to immutability. However, there is a performance hit:

BenchmarkAvlTree           10000            166657 ns/op
BenchmarkImmutableAvlTree   5000            333709 ns/op

Ternary Search Trie trie.TST

A ternary search trie is a symbol table specialized to byte strings. Ternary Search Tries (TSTs) are a particularly fast version of the more common R-Way Trie. They utilize less memory allowing them to store more data while still retaining all of the flexibility of the R-Way Trie. TSTs can be used to build a suffix tree for full text string indexing by storing every suffix of each string in addition to the string. However, even without storing all of the suffixes it is still a great structure for flexible prefix searches. For instance, TSTs can be used to implement extremely fast auto-complete functionality.

B+Tree tree/bptree.BpTree

A B+Tree is a general symbol table usually used for database indices. This implementation is not currently thread safe. However, unlike many B+Trees it fully supports duplicate keys making it suitable for use as a Multi-Map. There is also a variant which has unique keys, bptree.BpMap. B+Trees are storted and efficient to iterate over making them ideal choices for storing a large amount of data in sorted order. For storing a very large amount of data please utilize the fs2 version, fs2/bptree. fs2 utilizes memory mapped files in order to allow you to store more data than your computer has RAM.

Hash Tables

Separate Chaining Hash Table hashtable.Hash

See hashtable/hashtable.go. An implementation of the classic hash table with separate chaining to handle collisions.

Linear Hash Table with AVL Tree Buckets hashtable.LinearHash

See hashtables/linhash.go. An implementation of Linear Hashing, a technique usually used for secondary storage hash tables. Often employed by databases and file systems for hash indices. This version is mostly instructional see the accompanying blog post. If you want a disk backed version check out my file-structures repository. See the linhash directory.

Exceptions, Errors, and Testing

Errors errors

By default, most errors in Go programs to not track where they were created. Many programmers quickly discover they need to have stack traces associated with their errors. This is a light weight package which adds stack traces to errors. It also provides a very very simple logging function that reports where in your code your printed out the log. This is not a full featured logging solution but rather a replacement for using fmt.Printf when debugging.

Test Support test

The test package provides two minimal assertions and a way to get random strings and data. It also seeds the math/rand number generator. I consider this to the bare minimum of what is often needed when testing go code particularly data-structures. Since this package seeks to be entirely self contained with no dependencies no external testing package is used. This package is slowly being improved to encompass more common functionality between the different tests.

Exceptions as a Library exc

The exc package provides support for exceptions. They work very similarly to the way unchecked exceptions work in Java. They are built on top of the built-in panic and recover functions. See the README in the package for more information or checkout the documentation. They should play nice with the usual way of handling errors in Go and provide an easy way to create public APIs which return errors rather than throwing these non-standard exceptions.

Benchmarks

Note: these benchmarsk are fairly old and probably not easy to understand. Look at the relative difference not the absolute numbers as they are misleading. Each benchmark does many operations per "test" which makes it difficult to compare these numbers to numbers found elsewhere.

Benchmarks Put + Remove

$ go test -v -bench '.*' \
>   github.com/timtadh/data-structures/hashtable
>   github.com/timtadh/data-structures/tree/...
>   github.com/timtadh/data-structures/trie

BenchmarkGoMap             50000             30051 ns/op
BenchmarkMLHash            20000             78840 ns/op
BenchmarkHash              20000             81012 ns/op
BenchmarkTST               10000            149985 ns/op
BenchmarkBpTree            10000            185134 ns/op
BenchmarkAvlTree           10000            193069 ns/op
BenchmarkImmutableAvlTree   5000            367602 ns/op
BenchmarkLHash              1000           2743693 ns/op

Benchmarks Put

BenchmarkGoMap            100000             22036 ns/op
BenchmarkMLHash            50000             52104 ns/op
BenchmarkHash              50000             53426 ns/op
BenchmarkTST               50000             69852 ns/op
BenchmarkBpTree            20000             76124 ns/op
BenchmarkAvlTree           10000            142104 ns/op
BenchmarkImmutableAvlTree  10000            302196 ns/op
BenchmarkLHash              1000           1739710 ns/op

The performance of the in memory linear hash (MLHash) is slightly improved since the blog post do to the usage of an AVL Tree tree/avltree.go instead of an unbalanced binary search tree.

Related Projects

  • fs2 Memory mapped datastructures. A B+Tree, a list, and a platform for implementing more.

  • file-structures The previous version of fs2 of disk based file-structures. Also includes a linear virtual hashing implementation.

More Repositories

1

zhang-shasha

Tree edit distance using the Zhang Shasha algorithm
Python
425
star
2

lexmachine

Lex machinary for go.
Go
401
star
3

fs2

B+ Tree - List - File Structures 2 - Memory Mapped File Structures for Go
Go
400
star
4

file-structures

File Structures (B+Tree, BTree) for Go
Go
70
star
5

parsemis

The ParSeMiS project (Parallel and Sequential Graph Mining Suite) searches for frequent, interesting substructures in graph databases. https://www2.cs.fau.de/EN/research/zold/ParSeMiS/index.html
Java
43
star
6

pyflwor

A Query Langauge and System for Python Objects
Python
26
star
7

jpdg

generate program dependence graphs for java programs.
Java
25
star
8

dynagrok

Go
23
star
9

getline

A cross platform, BSD Licensed, Python library for getting keyboard input from the console. Support for history and line editing.
Python
21
star
10

netutils

A little library for turning TCP connections into go channels.
Go
11
star
11

dot

Stream parsing of the graphviz dot language for go.
Go
11
star
12

fuzzbuzz

An Attribute Grammar Fuzzer
Python
10
star
13

slang

Simple Language : A very simple computer language for teaching purposes.
Python
10
star
14

dot_tools

A library to parse and generate the graphviz dot langauge.
Python
10
star
15

goiso

A wrapper around libbliss for graph isomorphism testing and canonical labeling.
C++
8
star
16

goplay

Random Small Go Programs
Go
8
star
17

regex-machines

Regular Expression Processing Virtual Machines.
Go
7
star
18

pyflow

A compiler written in python for dataflow analysis experimentation.
Python
7
star
19

swork

A shell enviroment manager. Easily manage the shell enviroment (variables etc.) for working on various projects. Stands for start work.
Python
7
star
20

gobuild-fork

An obsolete build to for the Go Language (obsoleted by Go 1, the first stable go release).
Go
7
star
21

graple

Go
7
star
22

tcel

A type checked functional language.
Go
7
star
23

getopt

Yet Another getopt Library for Go. This one works like the python module getopt.
Go
6
star
24

GoAST

Generate a serialized AST of a Go Package as an ordered labeled tree.
Go
6
star
25

tdpp

Top Down Predictive Parser Written in Go
Go
6
star
26

peano

A evaluator for Peano Arithmetic as given in "The Incompleteness Phnomenon" by M. Goldstern and H. Judah. ISBN 1-56881-093-8.
Python
6
star
27

queued

Queued is a simple queue deamon; allows clients to have persistent TCP connections.
Go
5
star
28

combos

Parser combinators for Go. Easy parsing of LL(k) grammars.
Go
5
star
29

betterast

An AST for Python.
Python
4
star
30

jist

Python
4
star
31

diplomacy

Python
4
star
32

tst

Ternary Search Trie
Python
4
star
33

optutils

Small, composable utilities for writing great command line tools.
Python
4
star
34

gramstat

Produce statistics about the operational distribution of a grammar.
Python
3
star
35

crypt_framework

A generic encrypted communication framework for python.
Python
3
star
36

gitfutz

Futzing with data from git using pygit2
Python
2
star
37

pytdpp

A top down predictive parser implementation for Python. Can be used with any LL(1) grammar.
Python
2
star
38

ops-example

A small, very simple example of how to use Ansible and Fabric.
Nginx
2
star
39

twik

Tim's Web Interaction Kit
Python
2
star
40

GoIR

Produces a MIR (Medium level Intermediate Representation) for Go based on the AST produced by github.com/timtadh/GoAST. The MIR is intended to be useful starting point for program analysis/optimization.
Go
2
star
41

regrax

Go
2
star
42

passmash

The Site Specific Password Munger
Python
1
star
43

prettyme

Takes [markdown|html] file (singular) adds optional [CSS, title] and outputs an html page.
Python
1
star
44

hackthology

repo which generates timtadh.github.io
HTML
1
star
45

gopkgr

A packaging and installation system for Go.
Go
1
star
46

lex

A lexer written in Go.
Go
1
star
47

msbench

Python
1
star
48

lambda

A lambda calculus interpreter
Python
1
star
49

timtadh.github.io

website
HTML
1
star
50

warc-extractor

extract a random sample of HTML files from WARCs
Java
1
star
51

imposterous

a simple markdown posterous poster
Python
1
star
52

expr-caculator

A simple calculator
Go
1
star
53

PyOhio2011

PyOhio 2011 Example Code for "Python, Parsing, and You"
Python
1
star
54

calloc

An unsafe Go interface to the C memory allocator.
Go
1
star