  • Stars
    200
  • Rank 188,704 (Top 4 %)
  • Language
    Clojure
  • License
    Eclipse Public Li...
  • Created about 3 years ago
  • Updated 3 months ago

Repository Details

A Clojure machine learning library

scicloj.ml

An idiomatic Clojure machine learning library.

Main features:

  • Harmonized and idiomatic use of various classification, regression and unsupervised models
  • Supports creation of machine learning pipelines as-data
  • Includes easy-to-use, sophisticated cross-validations of pipelines
  • Includes the most important data transformations for preprocessing
  • Experiment tracking can be added by the user via a callback mechanism
  • Open architecture that allows plugging in any ML model, even ones implemented in non-JVM languages, including deep learning
  • Based on well-established Clojure/Java data science libraries

Quickstart

Dependencies:

{:deps
 {scicloj/scicloj.ml {:mvn/version "0.3"}}}
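
The snippet above is in deps.edn format. For a Leiningen project, the same Clojars coordinate would be written roughly as follows. This is a sketch, not taken from the original README: the project name, project version and Clojure version are placeholders.

```clojure
;; project.clj
;; "my-ml-project", "0.1.0-SNAPSHOT" and the Clojure version are hypothetical;
;; only the scicloj/scicloj.ml coordinate comes from the deps.edn snippet.
(defproject my-ml-project "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.11.1"]
                 [scicloj/scicloj.ml "0.3"]])
```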

Code:

(require '[scicloj.ml.core :as ml]
         '[scicloj.ml.metamorph :as mm]
         '[scicloj.ml.dataset :as ds])

;; read train and test datasets
(def titanic-train
  (ds/dataset "https://github.com/scicloj/metamorph-examples/raw/main/data/titanic/train.csv" {:key-fn keyword :parser-fn :string}))

(def titanic-test
  (-> "https://github.com/scicloj/metamorph-examples/raw/main/data/titanic/test.csv"
      (ds/dataset {:key-fn keyword :parser-fn :string})
      (ds/add-column :Survived [""] :cycle)))

;; construct pipeline function including Logistic Regression model
(def pipe-fn
  (ml/pipeline
   (mm/select-columns [:Survived :Pclass])
   (mm/add-column :Survived (fn [ds] (map #(case % "1" "yes" "0" "no" nil "") (:Survived ds))))
   (mm/categorical->number [:Survived :Pclass])
   (mm/set-inference-target :Survived)
   {:metamorph/id :model}
   (mm/model {:model-type :smile.classification/logistic-regression})))

;; run the pipeline, including the model, on the training data in mode :fit
(def trained-ctx
  (pipe-fn {:metamorph/data titanic-train
            :metamorph/mode :fit}))

;; run the pipeline in mode :transform on the test data, which produces the predictions
(def test-ctx
  (pipe-fn
   (assoc trained-ctx
          :metamorph/data titanic-test
          :metamorph/mode :transform)))

;; extract prediction from pipeline function result
(-> test-ctx :metamorph/data
    (ds/column-values->categorical :Survived))
    
;; => #tech.v3.dataset.column<string>[418]
;;    :Survived
;;    [no, no, yes, no, no, no, no, yes, no, no, no, no, no, yes, no, yes, yes, no, no, no...]   
                

Community

For support, use Clojurians on Zulip:

Scicloj.ml on Zulip

or on Clojurians Slack:

Scicloj.ml on Slack

Documentation

Full documentation is available as user guides.

API documentation: https://scicloj.github.io/scicloj.ml

Projects that scicloj.ml uses or is based on:

This library is itself a shim and defines no functions of its own. The code lives in the libraries listed below, and their functions are re-exported by scicloj.ml in a small number of namespaces for user convenience.

Scicloj.ml organises the existing code in 3 namespaces, as follows:

namespace scicloj.ml.core

Functions are re-exported from:

  • scicloj.metamorph.ml.*
  • scicloj.metamorph.core

namespace scicloj.ml.dataset

All functions in this namespace take a dataset as the first argument. The functions are re-exported from:

  • tablecloth.api
  • tech.v3.dataset.modelling
  • tech.v3.dataset.column-filters
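
Because every function takes the dataset first, calls chain naturally with the thread-first macro. The sketch below assumes that scicloj.ml 0.3 is on the classpath and that the tablecloth.api functions `dataset`, `select-columns`, `select-rows`, `column-names` and `row-count` are re-exported under the same names. The column names and values are made up for illustration.

```clojure
(require '[scicloj.ml.dataset :as ds])

;; A tiny in-memory dataset; names and values are illustrative only.
(def passengers
  (ds/dataset {:name   ["Ann" "Bob" "Cid"]
               :pclass [1 3 2]
               :age    [29 41 35]}))

;; Dataset-first functions compose with ->, each step returning a new dataset.
(def slim
  (-> passengers
      (ds/select-columns [:name :pclass]) ; keep two of the three columns
      (ds/select-rows [0 1])))            ; keep the first two rows by index

(ds/column-names slim) ; names of the remaining columns
(ds/row-count slim)    ; number of remaining rows
```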

namespace scicloj.ml.metamorph

All functions in this namespace take a metamorph context as the first argument, so they can be used directly in metamorph pipelines. The functions are re-exported from:

  • tablecloth.pipeline
  • tech.v3.libs.smile.metamorph
  • scicloj.metamorph.ml
  • tech.v3.dataset.metamorph
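
To illustrate what "taking a metamorph context" means, here is a conceptual sketch in plain Clojure. It is not the metamorph implementation: `center-step` and `run-pipeline` are hypothetical names. The only things borrowed from the Quickstart are the `:metamorph/data` and `:metamorph/mode` keys and the idea that a step keeps its fitted state in the context under an id.

```clojure
;; Conceptual sketch only; the real library differs in detail.
(defn center-step
  "Returns a context -> context function that subtracts the mean.
   In :fit mode the mean is learned from the data and stored under `id`;
   in :transform mode the stored mean is reused."
  [id]
  (fn [{:metamorph/keys [data mode] :as ctx}]
    (let [mean (if (= mode :fit)
                 (/ (reduce + data) (count data))
                 (get-in ctx [id :mean]))]
      (-> ctx
          (assoc-in [id :mean] mean)
          (assoc :metamorph/data (mapv #(- % mean) data))))))

(defn run-pipeline
  "Threads a context map through each step in order."
  [steps ctx]
  (reduce (fn [c step] (step c)) ctx steps))

(def steps [(center-step :center)])

;; :fit learns the mean (2) from the training data.
(def fitted
  (run-pipeline steps {:metamorph/data [1 2 3]
                       :metamorph/mode :fit}))

;; :transform reuses the fitted context with new data.
(def transformed
  (run-pipeline steps (assoc fitted
                             :metamorph/data [10 20]
                             :metamorph/mode :transform)))

(:metamorph/data fitted)      ;; => [-1 0 1]
(:metamorph/data transformed) ;; => [8 18]
```

The same pattern appears in the Quickstart: the context returned by the :fit run is passed back in with new data and mode :transform, so the trained model travels inside the context.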

If you are already familiar with any of the original namespaces, you can of course use them directly as well:

(require '[tablecloth.api :as tc])
(tc/add-column ...)

Plugins

scicloj.ml can be easily extended by plugins, which contribute models or other algorithms. Currently, the following plugins exist:
