  • Stars: 112
  • Rank: 312,240 (Top 7%)
  • Language: HTML
  • License: GNU General Publi...
  • Created over 3 years ago
  • Updated 6 months ago

Repository Details

Trees are all you need

forester: Quick and Simple Tools for Training and Testing of Tree-based Models

Building high-performing models takes a significant amount of time. Selecting an appropriate model structure, optimizing hyperparameters, and explaining the results are only part of the process of creating a machine learning-based solution. Despite the wide range of structures available, tree-based models regularly win competitions and hackathons. So, aren't tree-based models enough?

They definitely are, and that's why we want to fully automate the machine learning process for them, so that everyone can use the computational power of trees.

Installation

From GitHub

install.packages("devtools")
devtools::install_github("ModelOriented/forester")

Additional features installation

Some of the package dependencies are not available on CRAN, so they have to be installed manually as described below. These instructions should be especially helpful for macOS users:

catboost

The catboost model is used in the train() function as an additional engine.

devtools::install_url('https://github.com/catboost/catboost/releases/download/v1.1.1/catboost-R-Darwin-1.1.1.tgz', INSTALL_opts = c("--no-multiarch", "--no-test-load", "--no-staged-install"))

Alternatively, one can perform a longer installation that downloads the whole repository.

devtools::install_github('catboost/catboost', subdir = 'catboost/R-package')

ggradar

The ggradar package is required for creating the radar plot visualization in the report generated by the report() function.

devtools::install_github('ricardo-bion/ggradar', dependencies = TRUE)

tinytex

The tinytex package is required for creating a report with the report() function.

install.packages('tinytex')
tinytex::install_tinytex()
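
To confirm which of the optional dependencies above are already available, something like the following sketch may help. It uses only base R; the helper name check_optional is hypothetical and not part of forester.

```r
# Optional packages named in the installation notes above.
optional_pkgs <- c("catboost", "ggradar", "tinytex")

# Hypothetical helper: requireNamespace() returns TRUE when a package
# can be loaded, without attaching it to the search path.
check_optional <- function(pkgs) {
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

status <- check_optional(optional_pkgs)
print(status)

# List whatever still needs to be installed.
missing_pkgs <- names(status)[!status]
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
}
```

Note that tinytex::install_tinytex() installs the TeX distribution itself, so the tinytex package being present does not guarantee that the distribution is.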

How to build tree-based models in R?

What is the forester?

💡 full automation of the process of training tree-based models

💡 no demand for ML expertise

💡 powerful tool for making high-quality baseline models for experienced users

The forester package is an AutoML tool in R that wraps up all machine learning processes into a single train() function, which includes:

  • rendering a brief data check report,
  • preprocessing the initial dataset so that models can be trained on it,
  • training 5 tree-based models with default parameters, random search and Bayesian optimisation,
  • evaluating them and providing a ranked list.
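
As a rough illustration of the single-call workflow listed above, a minimal sketch might look as follows. It rests on assumptions that are not confirmed by the text here: that train() accepts the arguments data and y, and that the package bundles an example dataset called lisbon with a Price column. Check ?train before relying on any of these names.

```r
# Minimal sketch, guarded so it only runs when forester is installed.
demo_ran <- FALSE

if (requireNamespace("forester", quietly = TRUE)) {
  library(forester)

  # ASSUMPTION: 'lisbon' is an example dataset shipped with forester
  # and 'Price' is its target column.
  data("lisbon", package = "forester")

  # ASSUMPTION: argument names 'data' and 'y'; all other settings are
  # left at the package defaults.
  output <- train(data = lisbon, y = "Price")

  # Inspect which components the returned object exposes.
  print(names(output))
  demo_ran <- TRUE
} else {
  message("forester is not installed; see the Installation section.")
}
```

With default settings the call trains every available engine, so it can take a while on larger datasets.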

For whom is this package created?

The forester package is designed for beginners in data science, but also for more experienced users. They get an easy-to-use tool that can be used to prepare high-quality baseline models for comparison with more advanced methods or a set of output parameters for more thorough optimisations.

Introductory blogs

Authors

This package was created at MI2.AI (Warsaw University of Technology) as both a scientific research project and a Bachelor's thesis by:

Project co-ordinator and supervisor: Anna Kozak

Auxiliary supervisor: Przemysław Biecek

The previous version of forester was created by:

  • Hoang Thien Ly
  • Szymon Szmajdziński

More Repositories

1. DALEX (Python, 1,364 stars): moDel Agnostic Language for Exploration and eXplanation
2. DrWhy (R, 679 stars): DrWhy is the collection of tools for eXplainable AI (XAI). It's based on shared principles and simple grammar for exploration, explanation and visualisation of predictive models.
3. modelStudio (R, 323 stars): Interactive Studio for Explanatory Model Analysis
4. randomForestExplainer (R, 230 stars): A set of tools to understand what is happening inside a Random Forest
5. modelDown (R, 119 stars): modelDown generates a website with HTML summaries for predictive models
6. survex (R, 97 stars): Explainable Machine Learning in Survival Analysis
7. fairmodels (R, 86 stars): Flexible tool for bias detection, visualization, and mitigation
8. iBreakDown (R, 79 stars): Break Down with interactions for local explanations (SHAP, BreakDown, iBreakDown)
9. treeshap (R, 78 stars): Compute SHAP values for your tree-based models using the TreeSHAP algorithm
10. shapviz (R, 77 stars): SHAP Plots in R
11. DALEXtra (R, 65 stars): Extensions for the DALEX package
12. auditor (R, 58 stars): Model verification, validation, and error analysis
13. shapper (R, 58 stars): An R wrapper of SHAP python library
14. ingredients (R, 37 stars): Effects and Importances of Model Ingredients
15. SAFE (Python, 36 stars): Surrogate Assisted Feature Extraction
16. kernelshap (R, 36 stars): Different SHAP algorithms
17. DALEX-docs (Jupyter Notebook, 34 stars): Documentation for the DALEX project
18. live (R, 34 stars): Local Interpretable (Model-agnostic) Visual Explanations - model visualization for regression problems and tabular data based on LIME method. Available on CRAN
19. ArenaR (R, 30 stars): Data generator for Arena - interactive XAI dashboard
20. rSAFE (R, 28 stars): Surrogate Assisted Feature Extraction in R
21. EIX (R, 25 stars): Structure mining for xgboost model
22. factorMerger (R, 24 stars): Set of tools to support results from post hoc testing
23. EloML (R, 24 stars): R package EloML: Elo rating system for machine learning models
24. EMMA (HTML, 23 stars): Evaluation of Methods for dealing with Missing data in Machine Learning algorithms
25. xspliner (R, 23 stars): Explain black box with GLM
26. hstats (R, 23 stars): Friedman's H-statistics
27. Arena (Vue, 22 stars): Interactive XAI dashboard
28. MAIR (HTML, 19 stars): Monitoring of AI Regulations
29. pyCeterisParibus (Python, 19 stars): Python library for Ceteris Paribus Plots (What-if plots)
30. drifter (R, 19 stars): Concept Drift and Concept Shift Detection for Predictive Models
31. xai2shiny (R, 18 stars): Create Shiny application with model exploration from explainers
32. localModel (R, 14 stars): LIME-like explanations with interpretable features based on Ceteris Paribus curves. Now on CRAN.
33. vivo (R, 14 stars): Variable importance via oscillations
34. corrgrapher (R, 13 stars): Visualize correlations between variables
35. metaMIMIC (Jupyter Notebook, 12 stars)
36. EvidenceBasedML (9 stars): Evidence-Based Machine Learning
37. weles (Python, 9 stars)
38. triplot (R, 9 stars): Triplot: Instance- and data-level explanations for the groups of correlated features.
39. xai2cloud (R, 8 stars): Create web API from model explainers
40. xaibot (JavaScript, 8 stars): XAI chat bot for Titanic model - created with plumber
41. FairPAN (R, 7 stars)
42. AI-strategies-papers-regulations-monitoring (Jupyter Notebook, 7 stars): Monitoring of AI strategies, papers, and regulations
43. piBreakDown (Python, 4 stars): python version of iBreakDown
44. RME (Python, 3 stars): Recurrent Memory Explainer
45. mogger (Java, 2 stars): Logger for Predictive Models
46. ceterisParibus2 (Jupyter Notebook, 2 stars): Very experimental version of the ceterisParibus package.
47. DrWhyTemplate (CSS, 2 stars)
48. shimex (R, 2 stars): R Package for Exploring Models with Shiny App
49. DALEX2 (R, 2 stars): Explain! Package with core wrappers for DrWhy universe.
50. ModelDevelopmentProcess (HTML, 1 star): Source codes for Model Development Process plots
51. Hex4DrWhy (R, 1 star): Shiny app for logo prototyping