Probabilistic Learning for mlr3

mlr3proba

Package website: release

Probabilistic Supervised Learning for mlr3.

What is mlr3proba?

mlr3proba is a machine learning toolkit for making probabilistic predictions within the mlr3 ecosystem. It currently supports the following tasks:

  • Probabilistic supervised regression - Supervised regression with a predictive distribution as the return type.
  • Predictive survival analysis - Survival analysis where individual predictive hazards can be queried. This is equivalent to probabilistic supervised regression with censored observations.
  • Unconditional distribution estimation - Estimation where the distribution itself is returned. Sub-cases are density estimation and unconditional survival estimation.

Key features of mlr3proba are:

  • A unified fit/predict model interface to any probabilistic predictive model (frequentist, Bayesian, or other)
  • Pipeline/model composition
  • Task reduction strategies
  • Domain-agnostic evaluation workflows using task-specific algorithmic performance measures

mlr3proba makes use of the distr6 probability distribution interface as its probabilistic predictive return type.
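To give a feel for what a distribution object looks like as a return type, here is a minimal sketch using distr6 on its own. It assumes the distr6 package is installed; the standard normal is an arbitrary illustrative choice and is not something mlr3proba itself returns by default.

```r
library(distr6)

# A standard normal distribution represented as a distr6 object
d = Normal$new(mean = 0, sd = 1)

# Query the distribution at chosen points
dens_at_0 = d$pdf(0)       # density at 0
prob_le_0 = d$cdf(0)       # P(X <= 0)
prob_gt_0 = 1 - d$cdf(0)   # P(X > 0), computed from the cdf
med       = d$quantile(0.5) # median

print(c(pdf = dens_at_0, cdf = prob_le_0, surv = prob_gt_0, median = med))
```

A probabilistic prediction can be queried in the same way, so downstream code works with one object instead of separate vectors of means, quantiles, or probabilities.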

Feature Overview

The current mlr3proba release focuses on survival analysis, and contains:

  • Task frameworks for survival analysis (TaskSurv)
  • A comprehensive selection of predictive survival learners (mostly via mlr3extralearners)
  • A comprehensive selection of performance measures for predictive survival learners, with respect to prognostic index (continuous rank) prediction, and probabilistic (distribution) prediction
  • PipeOps integrated with mlr3pipelines, for basic pipeline building, and reduction/composition strategies using linear predictors and baseline hazards.
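The task and learner components listed above can be sketched as a short train/predict workflow. This is a minimal illustration, not an official example: it assumes mlr3, mlr3proba, and the survival package are installed, and the choice of the built-in "rats" task, the two selected features, and the 80/20 split are arbitrary.

```r
library(mlr3)
library(mlr3proba)

set.seed(1)

# Built-in survival task; keep two numeric features for simplicity
task = tsk("rats")
task$select(c("litter", "rx"))

# Cox proportional hazards learner (backed by the 'survival' package)
learner = lrn("surv.coxph")

# Simple random 80/20 train/test split on the row ids
train_ids = sample(task$row_ids, floor(0.8 * task$nrow))
test_ids  = setdiff(task$row_ids, train_ids)

learner$train(task, row_ids = train_ids)
prediction = learner$predict(task, row_ids = test_ids)

# Which return types the prediction carries, and the continuous ranking
print(prediction$predict_types)
print(head(prediction$crank))
```

The fit/predict calls are the same as for any other mlr3 learner; only the task type and the prediction's return types are specific to survival analysis.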

Roadmap

The vision of mlr3proba is to provide comprehensive machine learning functionality to the mlr3 ecosystem for continuous probabilistic return types.

The survival task and its features are considered maturing, and major changes are unlikely.

The density and probabilistic supervised regression tasks are currently in the early stages of development. Task frameworks have been drawn up, but may not be stable; learners need to be interfaced, and contributions are very welcome (see issues).

Installation

mlr3proba is not on CRAN and is unlikely to be reuploaded (see here for reasons). As such, you must install it with one of the following methods:

Install from r-universe:

options(repos=c(
  mlrorg = 'https://mlr-org.r-universe.dev',
  raphaels1 = 'https://raphaels1.r-universe.dev',
  CRAN = 'https://cloud.r-project.org'
))
install.packages("mlr3proba")

or

install.packages("mlr3proba", repos = "https://mlr-org.r-universe.dev")

Or for easier installation going forward:

  1. Run usethis::edit_r_profile(), then in the file that opens add or edit the repos option to look something like
options(repos = c(
  raphaels1 = "https://raphaels1.r-universe.dev",
  mlrorg = "https://mlr-org.r-universe.dev",
  CRAN = "https://cloud.r-project.org"
))
  2. Save and close the file, then restart your R session
  3. Run install.packages("mlr3proba") as usual

Install from GitHub:

remotes::install_github("mlr-org/mlr3proba")

Learners

Core learners are implemented in mlr3proba, recommended common learners are implemented in mlr3learners, and many more are implemented in mlr3extralearners. Use the interactive search table to search for available survival learners and see the learner status page for their live status.
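As a rough complement to the search table, the learners registered in the current R session can be listed from the mlr3 learner dictionary. This is a sketch assuming mlr3 and mlr3proba are installed; the names surv_ids and dens_ids are illustrative, and the result depends on which extension packages are loaded.

```r
library(mlr3)
library(mlr3proba)

# Keys of all registered learners whose id starts with "surv." or "dens."
surv_ids = mlr_learners$keys("^surv\\.")
dens_ids = mlr_learners$keys("^dens\\.")

print(surv_ids)
print(dens_ids)
```

Loading mlr3extralearners before running this would typically add further entries to the dictionary.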

Measures

For density estimation, only the log-loss is currently implemented. For survival analysis, see the full list here. Some commonly used measures are the following:

ID               Measure                  Package    Type
surv.dcalib      D-Calibration            mlr3proba  Calibration
surv.cindex      Concordance Index        mlr3proba  Discrimination
surv.uno_auc     Uno’s AUC                survAUC    Discrimination
surv.graf        Integrated Brier Score   mlr3proba  Scoring Rule
surv.rcll        Right-Censored Log Loss  mlr3proba  Scoring Rule
surv.intlogloss  Integrated Log Loss      mlr3proba  Scoring Rule
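A measure from the list above is constructed by its ID and passed to the usual mlr3 scoring functions. The following is a minimal sketch, assuming mlr3, mlr3proba, and the survival package are installed; the task, learner, and 3-fold cross-validation are arbitrary illustrative choices.

```r
library(mlr3)
library(mlr3proba)

set.seed(1)

# Illustrative setup: built-in task, two numeric features, Cox model
task = tsk("rats")
task$select(c("litter", "rx"))
learner = lrn("surv.coxph")

# 3-fold cross-validation, then aggregate the concordance index over folds
rr = resample(task, learner, rsmp("cv", folds = 3))
cindex = rr$aggregate(msr("surv.cindex"))

print(cindex)
```

Swapping "surv.cindex" for another ID from the list changes which aspect of the prediction is evaluated (discrimination, calibration, or a scoring rule), provided the learner returns the predict type that the measure requires.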

Bugs, Questions, Feedback

mlr3proba is a free and open source software project that encourages participation and feedback. If you have any issues, questions, suggestions or feedback, please do not hesitate to open an “issue” about it on the GitHub page!

In case of problems / bugs, it is often helpful if you provide a “minimum working example” that showcases the behavior (but don’t worry about this if the bug is obvious).

Similar Projects

Predecessors to this package are previous instances of survival modelling in mlr. The skpro package in the Python/scikit-learn ecosystem follows a similar interface for probabilistic supervised learning and is an architectural predecessor. Several packages exist that allow probabilistic predictive modelling with a Bayesian, model-specific general interface, such as rjags and stan. For implementation of a few survival models and measures, a central package is survival. There does not appear to be a package that provides an architectural framework for distribution/density estimation; see this list for a review of density estimation packages in R.

Acknowledgements

Several people contributed to the building of mlr3proba. Firstly, thanks to Michel Lang for writing mlr3survival. Several learners and measures implemented in mlr3proba, as well as the prediction, task, and measure surv objects, were written initially in mlr3survival before being absorbed into mlr3proba. Secondly, thanks to Franz Király for major contributions towards the design of the proba-specific parts of the package, including compositors and predict types, and for mathematical contributions towards the scoring rules implemented in the package. Finally, thanks to Bernd Bischl and the rest of the mlr core team for building mlr3 and for many conversations about the design of mlr3proba.

Citing mlr3proba

If you use mlr3proba, please cite our Bioinformatics article:

@Article{,
  title = {mlr3proba: An R Package for Machine Learning in Survival Analysis},
  author = {Raphael Sonabend and Franz J Király and Andreas Bender and Bernd Bischl and Michel Lang},
  journal = {Bioinformatics},
  month = {02},
  year = {2021},
  doi = {10.1093/bioinformatics/btab039},
  issn = {1367-4803},
}
