clj-ml

A machine learning library for Clojure built on top of Weka and friends

Installation

To install the library, first install Leiningen. You should also download the Weka 3.6.2 jar from the official Weka homepage. If Maven complains that it cannot find Weka, follow its instructions to install the jar manually.

To install from source

  • git clone the project
  • $ lein deps
  • $ lein compile
  • $ lein compile-java
  • $ lein uberjar

Installing from Clojars

[clj-ml "0.0.3-SNAPSHOT"]
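
The dependency vector above goes into the :dependencies list of a Leiningen project.clj. A minimal sketch follows; the project name, project version and Clojure version are placeholders chosen for illustration, not values prescribed by clj-ml.

```clojure
;; project.clj (sketch)
;; "my-ml-app", its version and the Clojure version are assumptions;
;; only the clj-ml coordinate comes from this README.
(defproject my-ml-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.2.0"]
                 [clj-ml "0.0.3-SNAPSHOT"]])
```

Running lein deps afterwards should fetch the library, subject to the Weka jar note in the Installation section.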

Installing from Maven

(add Clojars repository)

<dependency>
   <groupId>clj-ml</groupId>
   <artifactId>clj-ml</artifactId>
   <version>0.0.3-SNAPSHOT</version>
 </dependency>

Supported algorithms

  • Filters
      • supervised discretize
      • unsupervised discretize
      • supervised nominal to binary
      • unsupervised nominal to binary
  • Classifiers
      • C4.5 (J4.8)
      • naive Bayes
      • multilayer perceptron
  • Clusterers
      • k-means

Usage

  • I/O of data
    REPL> (use 'clj-ml.io)

    REPL> ; Loading data from an ARFF file, XRFF and CSV are also supported
    REPL> (def ds (load-instances :arff "file:///Applications/weka-3-6-2/data/iris.arff"))

    REPL> ; Saving data in a different format
    REPL> (save-instances :csv "file:///Users/antonio.garrote/Desktop/iris.csv"  ds)
  • Working with datasets
    REPL>(use 'clj-ml.data)

    REPL>; Defining a dataset
    REPL>(def ds (make-dataset "name" [:length :width {:kind [:good :bad]}] [ [12 34 :good] [24 53 :bad] ]))
    REPL>ds

    #<ClojureInstances @relation name

    @attribute length numeric
    @attribute width numeric
    @attribute kind {good,bad}

    @data
    12,34,good
    24,53,bad>

    REPL>; Using datasets like sequences
    REPL>(dataset-seq ds)

    (#<Instance 12,34,good> #<Instance 24,53,bad>)

    REPL>; Transforming instances  into maps or vectors
    REPL>(instance-to-map (first (dataset-seq ds)))

    {:kind :good, :width 34.0, :length 12.0}

    REPL>(instance-to-vector (dataset-at ds 0))
    [12.0 34.0 :good]
  • Filtering datasets
    REPL>(use 'clj-ml.filters)

    REPL>(def ds (load-instances :arff "file:///Applications/weka-3-6-2/data/iris.arff"))

    REPL>; Discretizing a numeric attribute using an unsupervised filter
    REPL>(def discretize (make-filter :unsupervised-discretize {:dataset ds :attributes [0 2]}))

    REPL>(def filtered-ds (filter-process discretize ds))
  • Using classifiers
    REPL>(use 'clj-ml.classifiers)

    REPL>; Building a classifier using a C4.5 decision tree
    REPL>(def classifier (make-classifier :decission-tree :c45))

    REPL>; We set the class attribute for the loaded dataset
    REPL>(dataset-set-class ds 4)

    REPL>; Training the classifier
    REPL>(classifier-train classifier ds)


     #<J48 J48 pruned tree
     ------------------

     petalwidth <= 0.6: Iris-setosa (50.0)
     petalwidth > 0.6
     |	petalwidth <= 1.7
     |	|   petallength <= 4.9: Iris-versicolor (48.0/1.0)
     |	|   petallength > 4.9
     |	|   |	petalwidth <= 1.5: Iris-virginica (3.0)
     |	|   |	petalwidth > 1.5: Iris-versicolor (3.0/1.0)
     |	petalwidth > 1.7: Iris-virginica (46.0/1.0)

     Number of Leaves  :		5

     Size of the tree :	9


    REPL>; We evaluate the classifier using a test dataset
    REPL>; The last parameter should be a separate test dataset; here we reuse the training data
    REPL>(def evaluation (classifier-evaluate classifier :dataset ds ds))

     === Confusion Matrix ===

       a  b  c   <-- classified as
      50  0  0 |  a = Iris-setosa
       0 49  1 |  b = Iris-versicolor
       0  2 48 |  c = Iris-virginica

     === Summary ===

     Correctly Classified Instances        147       98      %
     Incorrectly Classified Instances        3        2      %
     Kappa statistic                         0.97
     Mean absolute error                     0.0233
     Root mean squared error                 0.108
     Relative absolute error                 5.2482 %
     Root relative squared error            22.9089 %
     Total Number of Instances             150

    REPL>(:kappa evaluation)

     0.97

    REPL>(:root-mean-squared-error evaluation)

     0.10799370769526968

    REPL>(:precision evaluation)

     {:Iris-setosa 1.0, :Iris-versicolor 0.9607843137254902, :Iris-virginica
      0.9795918367346939}

    REPL>; The classifier can also be evaluated using cross-validation
    REPL>(classifier-evaluate classifier :cross-validation ds 10)

     === Confusion Matrix ===

       a  b  c   <-- classified as
      49  1  0 |  a = Iris-setosa
       0 47  3 |  b = Iris-versicolor
       0  4 46 |  c = Iris-virginica

     === Summary ===

     Correctly Classified Instances        142       94.6667 %
     Incorrectly Classified Instances        8        5.3333 %
     Kappa statistic                         0.92
     Mean absolute error                     0.0452
     Root mean squared error                 0.1892
     Relative absolute error                10.1707 %
     Root relative squared error            40.1278 %
     Total Number of Instances             150

    REPL>; A trained classifier can be used to classify new instances
    REPL>(def to-classify (make-instance ds
                                         {:class :Iris-versicolor,
                                          :petalwidth 0.2,
                                          :petallength 1.4,
                                          :sepalwidth 3.5,
                                          :sepallength 5.1}))
    REPL>(classifier-classify classifier to-classify)

     0.0

    REPL>(classifier-label to-classify)

     #<Instance 5.1,3.5,1.4,0.2,Iris-setosa>


    REPL>; The classifiers can be saved and restored later
    REPL>(use 'clj-ml.utils)

    REPL>(serialize-to-file classifier
                            "/Users/antonio.garrote/Desktop/classifier.bin")
  • Using clusterers
    REPL>(use 'clj-ml.clusterers)

    REPL> ; we build a clusterer using k-means and three clusters
    REPL> (def kmeans (make-clusterer :k-means {:number-clusters 3}))

    REPL> ; we need to remove the class from the dataset to
    REPL> ; use this clustering algorithm
    REPL> (dataset-remove-class ds)

    REPL> ; we build the clusters
    REPL> (clusterer-build kmeans ds)
    REPL> kmeans

      #<SimpleKMeans
      kMeans
      ======

      Number of iterations: 3
      Within cluster sum of squared errors: 7.817456892309574
      Missing values globally replaced with mean/mode

      Cluster centroids:
                                                Cluster#
      Attribute                Full Data               0               1               2
                                   (150)            (50)            (50)            (50)
      ==================================================================================
      sepallength                 5.8433           5.936           5.006           6.588
      sepalwidth                   3.054            2.77           3.418           2.974
      petallength                 3.7587            4.26           1.464           5.552
      petalwidth                  1.1987           1.326           0.244           2.026
      class                  Iris-setosa Iris-versicolor     Iris-setosa  Iris-virginica
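
The accuracy, kappa and precision figures reported in the classifier example above can be rechecked by hand from the first confusion matrix. The sketch below uses only clojure.core; the helper names (total, diagonal, row-sums, column-sums, accuracy, precision, kappa) are illustrative and are not part of clj-ml.

```clojure
;; Confusion matrix copied from the first evaluation output above
;; (rows = actual class, columns = predicted class).
(def confusion
  [[50  0  0]
   [ 0 49  1]
   [ 0  2 48]])

(defn row-sums [m] (map #(reduce + %) m))

(defn column-sums [m] (apply map + m))

(defn total [m] (reduce + (row-sums m)))

;; Correctly classified counts sit on the diagonal.
(defn diagonal [m] (map-indexed (fn [i row] (nth row i)) m))

(defn accuracy
  "Fraction of instances whose predicted class equals the actual class."
  [m]
  (/ (double (reduce + (diagonal m))) (total m)))

(defn precision
  "Per-class precision: correct predictions of a class divided by
  all predictions of that class (its column sum)."
  [m]
  (map (fn [tp col] (/ (double tp) col)) (diagonal m) (column-sums m)))

(defn kappa
  "Cohen's kappa: observed agreement corrected for the agreement
  expected by chance from the row and column totals."
  [m]
  (let [n  (double (total m))
        po (accuracy m)
        pe (/ (reduce + (map * (row-sums m) (column-sums m)))
              (* n n))]
    (/ (- po pe) (- 1.0 pe))))
```

Evaluating (accuracy confusion), (kappa confusion) and (precision confusion) at the REPL gives values that can be compared against the Summary block and the :kappa and :precision results shown earlier.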

License

MIT License
