  • Stars: 1,397
  • Rank 33,640 (Top 0.7 %)
  • Language: JavaScript
  • License: MIT License
  • Created almost 4 years ago
  • Updated over 1 year ago

arml

This repository lists machine learning libraries written in Rust. It compiles GitHub repositories, blogs, books, movies, discussions, and papers, and is aimed at people who are thinking of migrating from Python. 🦀🐍

The list is divided into categories for basic libraries and algorithms. It also includes small libraries and libraries that are no longer maintained. Helpful parts of the code are noted, as are the good libraries within each category.

The aim is to find better ways to use Rust for machine learning.

Support Tools

Jupyter Notebook

evcxr can be used as a Jupyter kernel or as a REPL. It is helpful for learning and validation.

Graph Plot

You might want to try plotters for now.

ASCII line graph:

Examples:

Vector

Most libraries use ndarray or std::vec.

Also look at nalgebra, which is a good fit when the size of the matrix is known in advance. See also: ndarray vs nalgebra - reddit
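As a rough illustration of the std::vec end of that spectrum, the sketch below stores a matrix in a flat, row-major `Vec<f64>` and multiplies it by a vector. The `Matrix` type and its methods are hypothetical names made up for this example. ndarray and nalgebra provide their own, much richer types, which are not shown here.

```rust
/// A tiny row-major matrix backed by a plain `Vec<f64>`.
/// Hypothetical helper for illustration; not an ndarray or nalgebra type.
#[derive(Debug, Clone, PartialEq)]
pub struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

impl Matrix {
    /// Builds a matrix from row-major data; returns `None` on a size mismatch.
    pub fn from_vec(rows: usize, cols: usize, data: Vec<f64>) -> Option<Self> {
        if data.len() == rows * cols {
            Some(Self { rows, cols, data })
        } else {
            None
        }
    }

    /// Computes the matrix-vector product `self * x`.
    /// Returns `None` when `x` does not have exactly `cols` elements.
    pub fn matvec(&self, x: &[f64]) -> Option<Vec<f64>> {
        if x.len() != self.cols {
            return None;
        }
        if self.cols == 0 {
            // Every row is an empty sum, so the result is all zeros.
            return Some(vec![0.0; self.rows]);
        }
        let out: Vec<f64> = self
            .data
            .chunks(self.cols) // one chunk per row
            .map(|row| row.iter().zip(x).map(|(a, b)| a * b).sum::<f64>())
            .collect();
        Some(out)
    }
}
```

Returning `Option` instead of panicking keeps shape errors visible to the caller, which is roughly the kind of check a dedicated array library performs for you.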

Dataframe

You might want to try polars for now. datafusion looks good too.

Image Processing

You might want to try image-rs for now. Algorithms such as linear transformations are implemented in other libraries as well.

Natural Language Processing (preprocessing)

  • google-research/deduplicate-text-datasets - This repository contains code to deduplicate language model datasets as described in the paper "Deduplicating Training Data Makes Language Models Better" by Katherine Lee, Daphne Ippolito, Andrew Nystrom, Chiyuan Zhang, Douglas Eck, Chris Callison-Burch and Nicholas Carlini. This repository contains both the ExactSubstr deduplication implementation (written in Rust) along with the scripts we used in the paper to perform deduplication and inspect the results (written in Python). In an upcoming update, we will add files to reproduce the NearDup-deduplicated versions of the C4, RealNews, LM1B, and Wiki-40B-en datasets.
  • pemistahl/lingua-rs - 👄 The most accurate natural language detection library in the Rust ecosystem, suitable for long and short text alike
  • usamec/cntk-rs - Wrapper around Microsoft CNTK library
  • stickeritis/sticker - A LSTM/Transformer/dilated convolution sequence labeler
  • tensordot/syntaxdot - Neural syntax annotator, supporting sequence labeling, lemmatization, and dependency parsing.
  • christophertrml/rs-natural - Natural Language Processing for Rust
  • bminixhofer/nnsplit - Semantic text segmentation. For sentence boundary detection, compound splitting and more.
  • greyblake/whatlang-rs - Natural language detection library for Rust.
  • finalfusion/finalfrontier - Context-sensitive word embeddings with subwords. In Rust.
  • bminixhofer/nlprule - A fast, low-resource Natural Language Processing and Error Correction library written in Rust.
  • rth/vtext - Simple NLP in Rust with Python bindings
  • tamuhey/tokenizations - Robust and Fast tokenizations alignment library for Rust and Python
  • vgel/treebender - A HDPSG-inspired symbolic natural language parser written in Rust
  • reinfer/blingfire-rs - Rust wrapper for the BlingFire tokenization library
  • CurrySoftware/rust-stemmers - A Rust implementation of some popular Snowball stemming algorithms
  • cmccomb/rust-stop-words - Common stop words in a variety of languages
  • Freyskeyd/nlp - Rust-nlp is a library to use Natural Language Processing algorithm with RUST
  • Daniel-Liu-c0deb0t/uwu - fastest text uwuifier in the west

Graphical Modeling

Interface & Pipeline & AutoML

Workflow

GPU

Comprehensive (like sklearn)

All libraries support the following algorithms.

  • Linear Regression
  • Logistic Regression
  • K-Means Clustering
  • Neural Networks
  • Gaussian Process Regression
  • Support Vector Machines
  • Gaussian Mixture Models
  • Naive Bayes Classifiers
  • DBSCAN
  • k-Nearest Neighbor Classifiers
  • Principal Component Analysis
  • Decision Tree
  • Elastic Net

You might want to try smartcore or linfa for now.
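To give a feel for the first algorithm in the list above, here is a minimal sketch of ordinary least squares for a single feature, written in plain Rust with no dependencies. The function name `fit_line` is hypothetical. This is not the smartcore or linfa API, both of which handle multiple features, validation, and much more.

```rust
/// Fits `y ≈ slope * x + intercept` by ordinary least squares.
/// Hypothetical helper for illustration only.
/// Returns `None` if the inputs are empty, differ in length,
/// or all `x` values are identical (the slope is undefined).
pub fn fit_line(xs: &[f64], ys: &[f64]) -> Option<(f64, f64)> {
    if xs.is_empty() || xs.len() != ys.len() {
        return None;
    }
    let n = xs.len() as f64;
    let mean_x = xs.iter().sum::<f64>() / n;
    let mean_y = ys.iter().sum::<f64>() / n;

    // Accumulate the covariance and variance terms around the means.
    let mut sxy = 0.0;
    let mut sxx = 0.0;
    for (x, y) in xs.iter().zip(ys) {
        sxy += (x - mean_x) * (y - mean_y);
        sxx += (x - mean_x) * (x - mean_x);
    }
    if sxx == 0.0 {
        return None;
    }
    let slope = sxy / sxx;
    let intercept = mean_y - slope * mean_x;
    Some((slope, intercept))
}
```

Calling `fit_line(&[0.0, 1.0, 2.0, 3.0], &[1.0, 3.0, 5.0, 7.0])` recovers a slope close to 2 and an intercept close to 1, since the points lie exactly on that line.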

Comprehensive (Statistics)

  • statrs-dev/statrs - Statistical computation library for Rust
  • rust-ndarray/ndarray-stats - Statistical routines for ndarray
  • Axect/Peroxide - Rust numeric library with R, MATLAB & Python syntax
    • Linear Algebra, Functional Programming, Automatic Differentiation, Numerical Analysis, Statistics, Special functions, Plotting, Dataframe
  • tarcieri/micromath - Embedded Rust arithmetic, 2D/3D vector, and statistics library

Gradient Boosting

Deep Neural Network

TensorFlow bindings and PyTorch bindings are the most common. tch-rs also includes a torchvision-like vision module, which is useful.

Graph Model

  • Synerise/cleora - Cleora AI is a general-purpose model for efficient, scalable learning of stable and inductive entity embeddings for heterogeneous relational data.
  • Pardoxa/net_ensembles - Rust library for random graph ensembles

Natural Language Processing (model)

Recommendation

  • PersiaML/PERSIA - High performance distributed framework for training deep learning recommendation models based on PyTorch.
  • jackgerrits/vowpalwabbit-rs - 🦀🐇 Rusty VowpalWabbit
  • outbrain/fwumious_wabbit - Fwumious Wabbit, fast on-line machine learning toolkit written in Rust
  • hja22/rucommender - Rust implementation of user-based collaborative filtering
  • maciejkula/sbr-rs - Deep recommender systems for Rust
  • chrisvittal/quackin - A recommender systems framework for Rust
  • snd/onmf - fast rust implementation of online nonnegative matrix factorization as laid out in the paper "detect and track latent factors with online nonnegative matrix factorization"
  • rhysnewell/nymph - Non-Negative Matrix Factorization in Rust

Information Retrieval

Full Text Search

Nearest Neighbor Search

  • Enet4/faiss-rs - Rust language bindings for Faiss
  • rust-cv/hnsw - HNSW ANN from the paper "Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs"
  • hora-search/hora - 🚀 Efficient approximate nearest neighbor search algorithm collection library, implemented in Rust 🦀. horasearch.com
  • InstantDomain/instant-distance - Fast approximate nearest neighbor searching in Rust, based on HNSW index
  • lerouxrgd/ngt-rs - Rust wrappers for NGT approximate nearest neighbor search
  • granne/granne - Graph-based Approximate Nearest Neighbor Search
  • u1roh/kd-tree - k-dimensional tree in Rust. Fast, simple, and easy to use.
  • qdrant/qdrant - Qdrant - vector similarity search engine with extended filtering support
  • rust-cv/hwt - Hamming Weight Tree from the paper "Online Nearest Neighbor Search in Hamming Space"
  • fulara/kdtree-rust - kdtree implementation for rust.
  • mrhooray/kdtree-rs - K-dimensional tree in Rust for fast geospatial indexing and lookup
  • kornelski/vpsearch - C library for finding nearest (most similar) element in a set
  • petabi/petal-neighbors - Nearest neighbor search algorithms including a ball tree and a vantage point tree.
  • ritchie46/lsh-rs - Locality Sensitive Hashing in Rust with Python bindings
  • kampersanda/mih-rs - Rust implementation of multi-index hashing for neighbor searches on 64-bit codes in the Hamming space
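For context, the libraries above speed up or approximate the exact search that a linear scan performs. The sketch below shows that baseline in plain Rust, using squared Euclidean distance. The function names are hypothetical and none of the listed crates' APIs are used.

```rust
/// Squared Euclidean distance between two points.
/// If the lengths differ, only the shared prefix is compared.
fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Exact nearest neighbor by linear scan over all points.
/// Hypothetical helper for illustration only.
/// Returns the index of the closest point, or `None` if `points` is empty.
pub fn nearest(points: &[Vec<f32>], query: &[f32]) -> Option<usize> {
    points
        .iter()
        .enumerate()
        .map(|(i, p)| (i, squared_distance(p, query)))
        // `total_cmp` gives a total order on floats, so NaN cannot panic here.
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}
```

The scan costs one distance computation per stored point for every query. Index structures such as k-d trees, HNSW graphs, or locality sensitive hashing exist to avoid paying that cost on large collections.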

Reinforcement Learning

Supervised Learning Model

Unsupervised Learning & Clustering Model

Statistical Model

  • Redpoll/changepoint - Includes the following change point detection algorithms: Bocpd (online Bayesian change point detection) and BocpdTruncated (same as Bocpd, but truncates the run-length distribution when those lengths are unlikely).
  • krfricke/arima - ARIMA modelling for Rust
  • Daingun/automatica - Automatic Control Systems Library
  • rbagd/rust-linearkalman - Kalman filtering and smoothing in Rust
  • sanity/pair_adjacent_violators - An implementation of the Pair Adjacent Violators algorithm for isotonic regression in Rust
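As an example of one technique named in this list, the sketch below implements the pool-adjacent-violators idea for isotonic (non-decreasing) regression with equal weights, in plain Rust. The function name `isotonic_fit` is hypothetical, and this is not the API of the pair_adjacent_violators crate.

```rust
/// Isotonic (non-decreasing) regression using pool adjacent violators,
/// with every observation weighted equally.
/// Hypothetical helper for illustration only.
/// Returns one fitted value per input value.
pub fn isotonic_fit(ys: &[f64]) -> Vec<f64> {
    // Each block stores (sum of its values, number of values).
    let mut blocks: Vec<(f64, usize)> = Vec::with_capacity(ys.len());

    for &y in ys {
        blocks.push((y, 1));
        // Merge backwards while the previous block's mean exceeds the last one's.
        while blocks.len() >= 2 {
            let (s2, n2) = blocks[blocks.len() - 1];
            let (s1, n1) = blocks[blocks.len() - 2];
            let mean_prev = s1 / (n1 as f64);
            let mean_last = s2 / (n2 as f64);
            if mean_prev > mean_last {
                blocks.pop();
                let last = blocks.len() - 1;
                blocks[last] = (s1 + s2, n1 + n2);
            } else {
                break;
            }
        }
    }

    // Expand each pooled block back into one fitted value per input.
    blocks
        .iter()
        .flat_map(|&(s, n)| std::iter::repeat(s / (n as f64)).take(n))
        .collect()
}
```

Each input is pushed once and each merge removes a block, so the loop does a linear amount of work overall.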

Evolutionary Algorithm

Reference

Nearby Projects

Blogs

Introduction

Tutorial

Apply

Case study

Discussion

Books

Movie

PodCast

Paper

  • End-to-end NLP Pipelines in Rust, Proceedings of Second Workshop for NLP Open Source Software (NLP-OSS), pages 20–25 Virtual Conference, 2020/11/19, Guillaume Becquin

How to contribute

To contribute, just update README.md.

When README.md is updated, CI runs automatically and the website is updated as well.

Thanks

Thanks to all the projects.

https://github.com/vaaaaanquish/Awesome-Rust-MachineLearning
