Trees are all you need

forester: Quick and Simple Tools for Training and Testing of Tree-based Models

Building high-performing models takes a significant amount of time. Selecting appropriate model structures, optimizing hyperparameters, and ensuring explainability are only parts of the process of creating a machine learning-based solution. Despite the wide range of structures considered, tree-based models regularly win competitions and hackathons. So, aren't tree-based models enough?

They definitely are, and that’s why we want to fully automate the machine learning process for them, so that everyone can use the computational power of trees.

Installation

From GitHub

install.packages("devtools")
devtools::install_github("ModelOriented/forester")

Additional features installation

Some package dependencies are not available on CRAN, so they have to be installed manually as described below. These instructions should be especially helpful for macOS users:

catboost

The catboost model is used in the train() function as an additional engine.

devtools::install_url('https://github.com/catboost/catboost/releases/download/v1.1.1/catboost-R-Darwin-1.1.1.tgz', INSTALL_opts = c("--no-multiarch", "--no-test-load", "--no-staged-install"))

Alternatively, one can run a longer installation that downloads the whole repository.

devtools::install_github('catboost/catboost', subdir = 'catboost/R-package')

ggradar

The ggradar package is required for the radar plot visualization in the report created by the report() function.

devtools::install_github('ricardo-bion/ggradar', dependencies = TRUE)

tinytex

The tinytex package is required for creating a report with the report() function.

install.packages('tinytex')
tinytex::install_tinytex()
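
As a quick sanity check after the steps above, the following sketch reports which of the optional packages are already installed. It uses only base R's requireNamespace(); the package names are taken from this section, and the helper variables (optional, installed, missing_pkgs) are illustrative rather than part of forester. Note that it checks the tinytex R package only, not the TeX distribution installed by tinytex::install_tinytex().

```r
# Optional dependencies named in this section.
optional <- c("catboost", "ggradar", "tinytex")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise.
installed <- vapply(optional, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# List whatever is still missing, if anything.
missing_pkgs <- optional[!installed]
if (length(missing_pkgs) > 0) {
  message("Not installed yet: ", paste(missing_pkgs, collapse = ", "))
}
```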

How to build tree-based models in R?

What is the forester?

💡 full automation of the process of training tree-based models

💡 no demand for ML expertise

💡 powerful tool for making high-quality baseline models for experienced users

The forester package is an AutoML tool in R that wraps up all machine learning processes into a single train() function, which includes:

  • rendering a brief data check report,
  • preprocessing the initial dataset just enough for models to be trained on it,
  • training 5 tree-based models with default parameters, random search and Bayesian optimisation,
  • evaluating them and providing a ranked list.
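
To make the train, evaluate, and rank idea concrete, here is a minimal sketch in plain R. It is not forester's implementation and does not call train(). As stand-in assumptions it uses a single engine (rpart, a recommended package shipped with R), the built-in mtcars data, a hypothetical 70/30 split, and random search only, with no Bayesian optimisation. All variable names are illustrative.

```r
library(rpart)  # stand-in tree engine; forester itself uses other engines

set.seed(1)
data   <- mtcars
target <- "mpg"

# Split rows into training and test sets (hypothetical 70/30 split).
train_idx <- sample(nrow(data), size = floor(0.7 * nrow(data)))
train_set <- data[train_idx, ]
test_set  <- data[-train_idx, ]

# Candidate configurations: rpart defaults plus a few randomly drawn settings.
configs <- list(default = rpart.control())
for (i in 1:3) {
  configs[[paste0("random_", i)]] <- rpart.control(
    maxdepth = sample(2:6, 1),
    minsplit = sample(4:10, 1),
    cp       = runif(1, 0.001, 0.05)
  )
}

# Fit one tree per configuration and compute RMSE on the held-out rows.
rmse <- vapply(configs, function(ctrl) {
  fit  <- rpart(reformulate(".", response = target),
                data = train_set, control = ctrl)
  pred <- predict(fit, newdata = test_set)
  sqrt(mean((test_set[[target]] - pred)^2))
}, numeric(1))

# Ranked list: the model with the lowest test RMSE comes first.
ranked <- data.frame(model = names(rmse), rmse = unname(rmse))
ranked <- ranked[order(ranked$rmse), ]
print(ranked)
```

The point of the sketch is the shape of the result: one row per trained model, ordered by a held-out metric, which is the kind of ranked list described above.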

For whom is this package created?

The forester package is designed for beginners in data science, but also for more experienced users. Both groups get an easy-to-use tool for preparing high-quality baseline models to compare with more advanced methods, or for obtaining a set of output parameters to use in more thorough optimisations.

Introductory blogs

Authors

This package was created at MI2.AI (Warsaw University of Technology) as both scientific research and a Bachelor's thesis by:

Project co-ordinator and supervisor: Anna Kozak

Auxiliary supervisor: Przemysław Biecek

The previous version of forester was created by:

  • Hoang Thien Ly
  • Szymon Szmajdziński
