mlr3: Machine Learning in R - next generation

mlr3

Package website: release | dev

Efficient, object-oriented programming on the building blocks of machine learning. Successor of mlr.

Installation

Install the latest release from CRAN:

install.packages("mlr3")

Install the development version from GitHub:

remotes::install_github("mlr-org/mlr3")

If you want to get started with mlr3, we recommend installing the mlr3verse meta-package, which installs mlr3 and some of the most important extension packages:

install.packages("mlr3verse")

Example

Constructing Learners and Tasks

library(mlr3)

# create learning task
task_penguins = as_task_classif(species ~ ., data = palmerpenguins::penguins)
task_penguins
## <TaskClassif:palmerpenguins::penguins> (344 x 8)
## * Target: species
## * Properties: multiclass
## * Features (7):
##   - int (3): body_mass_g, flipper_length_mm, year
##   - dbl (2): bill_depth_mm, bill_length_mm
##   - fct (2): island, sex
# load learner and set hyperparameter
learner = lrn("classif.rpart", cp = .01)
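
Once a task and a learner exist, both can be inspected and modified through their fields. The following is a minimal sketch, assuming the palmerpenguins package is installed; the variable names and the choice of `maxdepth = 3L` are illustrative and not part of the original example.

```r
library(mlr3)

# same task and learner as above
task = as_task_classif(species ~ ., data = palmerpenguins::penguins)
learner = lrn("classif.rpart", cp = 0.01)

# task metadata
task$nrow            # number of observations
task$feature_names   # names of the feature columns
task$target_names    # name of the target column

# learner metadata
learner$param_set$values   # hyperparameters that are currently set
learner$predict_types      # prediction types the learner supports
learner$properties         # capabilities, e.g. handling of missing values

# hyperparameters can also be changed after construction
learner$param_set$values$maxdepth = 3L
```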

Basic train + predict

# train/test split
split = partition(task_penguins, ratio = 0.67)

# train the model on the training rows
learner$train(task_penguins, split$train)

# predict on the held-out test rows
prediction = learner$predict(task_penguins, split$test)

# calculate performance
# note: the printed counts below cover all 344 rows; with a 67/33 split only
# the test rows are counted, so your numbers will differ
prediction$confusion
##            truth
## response    Adelie Chinstrap Gentoo
##   Adelie       146         5      0
##   Chinstrap      6        63      1
##   Gentoo         0         0    123
measure = msr("classif.acc")
prediction$score(measure)
## classif.acc 
##   0.9651163
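
The same workflow can return class probabilities instead of hard labels. This is a hedged sketch rather than part of the original README: it assumes that `partition()` returns a list whose elements are named `train` and `test`, and the seed and the second measure (`classif.ce`) are arbitrary choices for illustration.

```r
library(mlr3)

set.seed(1)  # the split is random; the seed only makes this sketch repeatable

task = as_task_classif(species ~ ., data = palmerpenguins::penguins)
learner = lrn("classif.rpart", predict_type = "prob")

# disjoint row ids for training and testing
split = partition(task, ratio = 0.67)

learner$train(task, row_ids = split$train)
prediction = learner$predict(task, row_ids = split$test)

# one row per test observation: truth, response and one prob.* column per class
head(as.data.table(prediction))

# several measures can be scored at once
prediction$score(msrs(c("classif.acc", "classif.ce")))
```

Because only the test rows are predicted, the confusion matrix of this prediction sums to the size of the test set, not to the 344 rows of the full data set.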

Resample

# 3-fold cross validation
resampling = rsmp("cv", folds = 3L)

# run experiments
rr = resample(task_penguins, learner, resampling)

# access results
rr$score(measure)[, .(task_id, learner_id, iteration, classif.acc)]
##                     task_id    learner_id iteration classif.acc
## 1: palmerpenguins::penguins classif.rpart         1   0.9391304
## 2: palmerpenguins::penguins classif.rpart         2   0.9478261
## 3: palmerpenguins::penguins classif.rpart         3   0.9298246
rr$aggregate(measure)
## classif.acc 
##    0.938927
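
Resampling a single learner extends naturally to comparing several learners on the same folds. The sketch below uses `benchmark_grid()` and `benchmark()` for this; the choice of `classif.featureless` as a baseline, the seed, and the logger threshold are assumptions made for illustration.

```r
library(mlr3)

set.seed(1)

# reduce console output from the mlr3 logger (optional)
lgr::get_logger("mlr3")$set_threshold("warn")

task = as_task_classif(species ~ ., data = palmerpenguins::penguins)

# a decision tree and a baseline that always predicts the majority class
learners = lrns(c("classif.rpart", "classif.featureless"))

# all combinations of task, learner and resampling
design = benchmark_grid(
  tasks = task,
  learners = learners,
  resamplings = rsmp("cv", folds = 3L)
)

# run the experiments
bmr = benchmark(design)

# aggregated accuracy per learner
agg = bmr$aggregate(msr("classif.acc"))
agg[, .(task_id, learner_id, classif.acc)]
```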

Extension Packages

Consult the wiki for short descriptions and links to the respective repositories.

For beginners, we strongly recommend installing and loading the mlr3verse package for a better user experience.

Why a rewrite?

mlr was first released to CRAN in 2013. Its core design and architecture date back even further. The addition of many features has led to feature creep, which makes mlr hard to maintain and hard to extend. We also think that while mlr was nicely extensible in some parts (learners, measures, etc.), other parts were less easy to extend from the outside. Also, many helpful R libraries did not exist at the time mlr was created, and their inclusion would result in non-trivial API changes.

Design principles

  • Only the basic building blocks for machine learning are implemented in this package.
  • Focus on computation here. Visualization and other non-core functionality can go in extra packages.
  • Overcome the limitations of R’s S3 classes with the help of R6.
  • Embrace R6 for a clean OO-design, object state-changes and reference semantics. This might be less “traditional R”, but seems to fit mlr nicely.
  • Embrace data.table for fast and convenient data frame computations.
  • Combine data.table and R6. For this, we make heavy use of list columns in data.tables.
  • Defensive programming and type safety. All user input is checked with checkmate. Return types are documented, and mechanisms popular in base R which “simplify” the result unpredictably (e.g., sapply() or drop argument in [.data.frame) are avoided.
  • Be light on dependencies. mlr3 requires the following packages at runtime:
    • parallelly: Helper functions for parallelization. No extra recursive dependencies.
    • future.apply: Resampling and benchmarking are parallelized with the future abstraction, which interfaces with many parallel backends.
    • backports: Ensures backward compatibility with older R releases. Developed by members of the mlr team. No recursive dependencies.
    • checkmate: Fast argument checks. Developed by members of the mlr team. No extra recursive dependencies.
    • mlr3misc: Miscellaneous functions used in multiple mlr3 extension packages. Developed by the mlr team.
    • paradox: Descriptions for parameters and parameter sets. Developed by the mlr team. No extra recursive dependencies.
    • R6: Reference class objects. No recursive dependencies.
    • data.table: Extension of R’s data.frame. No recursive dependencies.
    • digest (via mlr3misc): Hash digests. No recursive dependencies.
    • uuid: Create unique string identifiers. No recursive dependencies.
    • lgr: Logging facility. No extra recursive dependencies.
    • mlr3measures: Performance measures. No extra recursive dependencies.
    • mlbench: A collection of machine learning data sets. No dependencies.
    • palmerpenguins: A classification data set about penguins, used in examples and provided as a toy task. No dependencies.
  • Reflections: Objects are queryable for properties and capabilities, allowing you to program on them.
  • Additional functionality that comes with extra dependencies:
    • To capture output, warnings and exceptions, evaluate and callr can be used.
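
Several of these principles can be observed directly in a session. The following sketch is illustrative only: it assumes the built-in `penguins` toy task is available via `tsk("penguins")`, and the variable names are hypothetical.

```r
library(mlr3)

# R6 reference semantics: plain assignment creates an alias, not a copy
learner = lrn("classif.rpart")
alias = learner
copy = learner$clone(deep = TRUE)   # an independent copy

learner$param_set$values$cp = 0.05
alias$param_set$values$cp           # also changed: alias and learner are one object
copy$param_set$values$cp            # unchanged: cp was never set on the copy

# reflections: objects can be queried for properties and capabilities
task = tsk("penguins")
task$properties
learner$feature_types
"missings" %in% learner$properties

# data.table and R6 combined: results are tables with list columns of R6 objects
rr = resample(task, learner, rsmp("holdout"))
tab = as.data.table(rr)
is.list(tab$learner)
inherits(tab$learner[[1]], "Learner")

# the kind of base R "simplification" that is avoided:
# the return type of sapply() depends on the input
sapply(list(), length)              # list()
vapply(list(), length, integer(1))  # integer(0)
```

Using `vapply()` with an explicit template, as in the last line, keeps the return type fixed regardless of the input length, which is the behavior the type-safety principle asks for.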

Contributing to mlr3

This R package is licensed under the LGPL-3. If you encounter problems using this software (lack of documentation, misleading or wrong documentation, unexpected behavior, bugs, …) or just want to suggest features, please open an issue in the issue tracker. Pull requests are welcome and will be included at the discretion of the maintainers.

Please consult the wiki for a style guide, a roxygen guide and a pull request guide.

Citing mlr3

If you use mlr3, please cite our JOSS article:

@Article{mlr3,
  title = {{mlr3}: A modern object-oriented machine learning framework in {R}},
  author = {Michel Lang and Martin Binder and Jakob Richter and Patrick Schratz and Florian Pfisterer and Stefan Coors and Quay Au and Giuseppe Casalicchio and Lars Kotthoff and Bernd Bischl},
  journal = {Journal of Open Source Software},
  year = {2019},
  month = {dec},
  doi = {10.21105/joss.01903},
  url = {https://joss.theoj.org/papers/10.21105/joss.01903},
}

More Repositories

1. mlr (R, 1,643 stars): Machine Learning in R
2. mlr3book (TeX, 253 stars): Online version of Bischl, B., Sonabend, R., Kotthoff, L., & Lang, M. (Eds.). (2024). "Applied Machine Learning Using mlr3 in R". CRC Press.
3. mlrMBO (R, 187 stars): Toolbox for Bayesian Optimization and Model-Based Optimization in R
4. mlr3pipelines (R, 138 stars): Dataflow Programming for Machine Learning in R
5. mlr3proba (R, 128 stars): Probabilistic Learning for mlr3
6. mlr3extralearners (R, 90 stars): Extra learners for use in mlr3.
7. mlr3learners (R, 90 stars): Recommended learners for mlr3
8. mlr-outreach (HTML, 64 stars)
9. parallelMap (R, 57 stars): R package to interface some popular parallelization backends with a unified interface
10. mlr3tuning (R, 54 stars): Hyperparameter optimization package of the mlr3 ecosystem
11. mlr3verse (R, 50 stars): Meta-package for installing/updating mlr3* packages.
12. mlr3spatiotempcv (TeX, 48 stars): Spatiotemporal resampling methods for mlr3
13. mlr3viz (R, 42 stars): Visualizations for mlr3
14. mlr3spatial (HTML, 42 stars): Spatial objects within the mlr3 ecosystem
15. mlr3torch (R, 38 stars): Deep learning framework for the mlr3 ecosystem based on torch
16. mlrCPO (R, 37 stars): Composable Preprocessing Operators for MLR
17. mlr3keras (R, 36 stars): Deep learning for mlr3
18. mcboost (R, 30 stars): Multi-Calibration & Multi-Accuracy Boosting for R
19. paradox (R, 28 stars): ParamHelpers Next Generation
20. ParamHelpers (R, 26 stars): Helpers for parameters in black-box optimization, tuning and machine learning.
21. mlr3mbo (R, 25 stars): Flexible Bayesian Optimization in R
22. mlr3cluster (R, 21 stars): Cluster analysis for mlr3
23. mlr3fselect (R, 21 stars): Feature selection package of the mlr3 ecosystem.
24. mlr3gallery (HTML, 21 stars): Case studies using mlr3
25. mlr3db (R, 21 stars): Data Backends to let mlr3 work transparently with (remote) data bases
26. mlr3filters (R, 20 stars): Filter-based feature selection for mlr3
27. bbotk (R, 20 stars): Black-box optimization framework for R.
28. mlr3temporal (HTML, 20 stars): Forecasting for mlr3
29. mlr3-learndrake (HTML, 18 stars): Template for using mlr3 with drake
30. mlr3hyperband (R, 18 stars): Successive Halving and Hyperband in the mlr3 ecosystem
31. miesmuschel (R, 15 stars): Flexible Mixed Integer Evolutionary Strategies
32. user2020 (14 stars): Material for the useR2020 tutorial
33. mlr3fairness (HTML, 14 stars): mlr3 extension for Fairness in Machine Learning
34. mlr3tuningspaces (R, 13 stars): Collection of search spaces for hyperparameter optimization in the mlr3 ecosystem
35. mlr3measures (R, 13 stars): Performance measures used in mlr3
36. mlr3benchmark (R, 12 stars): Analysis and tools for benchmarking in mlr3 and beyond.
37. mlr3misc (R, 12 stars): Miscellaneous helper functions for mlr3
38. farff (R, 11 stars): a faster arff parser
39. mlr3cheatsheets (HTML, 11 stars): Cheat Sheets for mlr3 and Friends
40. mlr3website (R, 8 stars): The mlr3 quarto website and accompanying R package.
41. rush (R, 8 stars): Parallel and distributed computing in R.
42. mlr-extralearner (R, 8 stars)
43. mlr3survival (R, 7 stars): Survival analysis for mlr3
44. mlr3automl (R, 6 stars)
45. mlr3-targets (R, 6 stars)
46. mlr3oml (R, 6 stars): Connect mlr3 with OpenML
47. mlr3summary (R, 6 stars)
48. mlr3learners.template (R, 5 stars): Learner from package {<package>} for mlr3
49. mlr3batchmark (R, 5 stars): Connector between mlr3 and batchtools
50. mlr3docker (Dockerfile, 5 stars): Docker Image for mlr3
51. mlr3ordinal (R, 5 stars): Ordinal Regression for mlr3
52. mlr3fda (R, 4 stars): Functional Data Analysis for mlr3
53. mlr3multioutput (R, 4 stars): Multiple Targets for mlr3
54. styler.mlr (R, 4 stars): {styler} mlr style guide
55. mlr3forecast (R, 4 stars): Time series forecasting for mlr3
56. mlr-web (HTML, 3 stars)
57. mlr3pkgdowntemplate (SCSS, 2 stars): pkgdown template package for mlr* packages
58. mlr3data (R, 2 stars): Data sets used in the book, gallery, or in examples of mlr3.
59. mlr3inferr (R, 2 stars): Statistical methods for inference on the generalization error
60. mlr3ensemble (R, 1 star): mlr3 extension package for ensemble machine learning
61. mlr-org-website (HTML, 1 star)
62. mlrcranlog (R, 1 star): mlr-org cranlogs