
Repository Details

ID3-based implementation of the ML Decision Tree algorithm

Decision Tree

A Ruby library which implements the ID3 (information gain) algorithm for decision tree learning. Both continuous and discrete datasets can currently be learned.

  • The discrete model assumes unique labels, and the resulting tree can be graphed and converted into a PNG for visual analysis.
  • The continuous model looks at all possible values for a variable and iteratively chooses the best threshold between all possible assignments. This results in a binary tree which is partitioned by the threshold at every step (e.g. temperature > 20C).
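The threshold search for a continuous variable can be sketched as follows. This is a minimal illustration, not the library's own code: the helper names entropy and best_threshold are hypothetical, and taking midpoints between consecutive distinct values as candidate thresholds is an assumption about how "all possible assignments" might be enumerated.

```ruby
# Shannon entropy (in bits) of an array of class labels.
def entropy(labels)
  total = labels.size.to_f
  labels.group_by(&:itself).values.sum do |group|
    p = group.size / total
    -p * Math.log2(p)
  end
end

# Hypothetical helper: returns [threshold, gain] for the split
# "value > threshold" with the highest information gain.
# rows is an array of [numeric_value, label] pairs.
# Returns nil when there are fewer than two distinct values.
def best_threshold(rows)
  base = entropy(rows.map(&:last))
  values = rows.map(&:first).uniq.sort
  # Assumption: candidates are midpoints between consecutive distinct values.
  candidates = values.each_cons(2).map { |a, b| (a + b) / 2.0 }
  scored = candidates.map do |t|
    below, above = rows.partition { |value, _| value <= t }
    remainder = [below, above].sum do |part|
      part.size.to_f / rows.size * entropy(part.map(&:last))
    end
    [t, base - remainder]
  end
  scored.max_by { |_, gain| gain }
end

rows = [[36.6, 'healthy'], [37, 'sick'], [38, 'sick'], [36.7, 'healthy'], [40, 'sick']]
threshold, gain = best_threshold(rows)
puts "temperature > #{threshold} (gain #{gain.round(3)})"
```

On this sample the chosen threshold falls between 36.7 and 37, which separates the two classes completely. A full tree builder would repeat the search on each side of the split.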

Features

  • ID3 algorithms for continuous and discrete cases, with support for inconsistent datasets.
  • Graphviz component to visualize the learned tree
  • Support for multiple and symbolic outputs, and graphing of continuous trees.
  • Returns default value when no branches are suitable for input
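The default-value behaviour in the last item can be pictured with a small sketch. The nested-hash tree layout and the predict helper below are assumptions made for illustration and are not taken from the library's internals.

```ruby
# Hypothetical discrete tree stored as nested hashes:
# { attribute => { value => subtree_or_label } }
TREE = {
  'color' => {
    'red'  => { 'size' => { 'big' => 'angry', 'small' => 'not angry' } },
    'blue' => 'not angry'
  }
}.freeze

# Walk the tree for a sample (a hash of attribute => value).
# When no branch matches the sample's value, fall back to the default.
def predict(tree, sample, default)
  node = tree
  while node.is_a?(Hash)
    attribute, branches = node.first
    value = sample[attribute]
    return default unless branches.key?(value)
    node = branches[value]
  end
  node
end

puts predict(TREE, { 'color' => 'red', 'size' => 'big' }, 'not angry')
puts predict(TREE, { 'color' => 'green' }, 'not angry')
```

The second call has no matching branch for 'green', so the default passed by the caller is returned instead of raising an error.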

Implementation

  • Ruleset is a class that trains an ID3Tree with 2/3 of the training data, converts it into a set of rules, and prunes the rules with the remaining 1/3 of the training data (in a C4.5 way).
  • Bagging is a bagging-based trainer which trains 10 Ruleset trainers and, when predicting, chooses the best output by voting.
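The two ideas above, a 2/3 versus 1/3 data split and a vote across several trainers, can be sketched independently of the library. The helper names split_two_thirds and majority_vote are hypothetical, and shuffling before the split is an assumption; the library may select its subsets differently.

```ruby
# Hypothetical helper: split data into a 2/3 part (used to grow the tree)
# and a 1/3 part (used to prune the rules).
# Assumption: the rows are shuffled first.
def split_two_thirds(data, rng = Random.new)
  shuffled = data.shuffle(random: rng)
  cut = (shuffled.size * 2 / 3.0).round
  [shuffled.first(cut), shuffled.drop(cut)]
end

# Hypothetical helper: return the label predicted most often.
# Ties go to the label that appeared first in the list.
def majority_vote(predictions)
  predictions.group_by(&:itself).max_by { |_, votes| votes.size }.first
end

data = (1..9).to_a
grow, prune = split_two_thirds(data, Random.new(42))
puts "grow: #{grow.size} rows, prune: #{prune.size} rows"

# Stand-in for the outputs of 10 independently trained predictors.
votes = %w[sick sick healthy sick healthy sick sick sick healthy sick]
puts majority_vote(votes)
```

With 10 voters an even split is possible, so the tie-breaking rule matters; here it simply favours the first label seen.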

Blog post with explanation & examples

Example

require 'decisiontree'

attributes = ['Temperature']
training = [
  [36.6, 'healthy'],
  [37, 'sick'],
  [38, 'sick'],
  [36.7, 'healthy'],
  [40, 'sick'],
  [50, 'really sick'],
]

# Instantiate the tree, and train it based on the data (set default to 'sick')
dec_tree = DecisionTree::ID3Tree.new(attributes, training, 'sick', :continuous)
dec_tree.train

test = [37, 'sick']
decision = dec_tree.predict(test)
puts "Predicted: #{decision} ... True decision: #{test.last}"

# => Predicted: sick ... True decision: sick

# Specify the type (:discrete or :continuous) of each attribute when creating the tree
labels = ["hunger", "color"]
training = [
  [8, "red", "angry"],
  [6, "red", "angry"],
  [7, "red", "angry"],
  [7, "blue", "not angry"],
  [2, "red", "not angry"],
  [3, "blue", "not angry"],
  [2, "blue", "not angry"],
  [1, "red", "not angry"]
]

dec_tree = DecisionTree::ID3Tree.new(labels, training, "not angry", color: :discrete, hunger: :continuous)
dec_tree.train

test = [7, "red", "angry"]
decision = dec_tree.predict(test)
puts "Predicted: #{decision} ... True decision: #{test.last}"

# => Predicted: angry ... True decision: angry

License

The MIT License - Copyright (c) 2006 Ilya Grigorik
