  • Stars: 188
  • Rank: 188,289 (Top 4%)
  • Language: R
  • License: Other
  • Created about 4 years ago
  • Updated 4 months ago

Repository Details

Modeling Workflows

workflows

What is a workflow?

A workflow is an object that can bundle together your pre-processing, modeling, and post-processing requests. For example, if you have a recipe and parsnip model, these can be combined into a workflow. The advantages are:

  • You don't have to keep track of separate objects in your workspace.

  • The recipe prepping and model fitting can be executed using a single call to fit().

  • If you have custom tuning parameter settings, these can be defined using a simpler interface when combined with tune.

  • In the future, workflows will be able to add post-processing operations, such as modifying the probability cutoff for two-class models.

Installation

You can install workflows from CRAN with:

install.packages("workflows")

You can install the development version from GitHub with:

# install.packages("pak")
pak::pak("tidymodels/workflows")

Example

Suppose you were modeling data on cars, say, the fuel efficiency of 32 cars. You know that the relationship between engine displacement and miles-per-gallon is nonlinear, and you would like to model that as a spline before adding it to a Bayesian linear regression model. You might have a recipe to specify the spline:

library(recipes)
library(parsnip)
library(workflows)

spline_cars <- recipe(mpg ~ ., data = mtcars) %>% 
  step_ns(disp, deg_free = 10)

and a model object:

bayes_lm <- linear_reg() %>% 
  set_engine("stan")

To use these, you would generally run:

spline_cars_prepped <- prep(spline_cars, mtcars)
bayes_lm_fit <- fit(bayes_lm, mpg ~ ., data = juice(spline_cars_prepped))

You can't predict on new samples using bayes_lm_fit without the prepped version of spline_cars around. You also might have other models and recipes in your workspace. This can lead to mixing them up or forgetting to save the model/recipe pair that you are most interested in.
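To make the dependency concrete, here is a minimal sketch of predicting without a workflow. It assumes the "lm" engine in place of "stan" so that it runs without rstanarm installed; the names plain_lm, new_cars, and manual_preds are illustrative and not part of the original example.

```r
library(recipes)
library(parsnip)

# Recipe from the text: natural spline on engine displacement
spline_cars <- recipe(mpg ~ ., data = mtcars) %>%
  step_ns(disp, deg_free = 10)

# Assumption: "lm" is swapped in for "stan" to keep the sketch lightweight
plain_lm <- linear_reg() %>%
  set_engine("lm")

# Estimate the recipe, then fit the model on the processed training data
spline_cars_prepped <- prep(spline_cars, training = mtcars)
train_baked <- bake(spline_cars_prepped, new_data = NULL)
plain_lm_fit <- fit(plain_lm, mpg ~ ., data = train_baked)

# New samples must pass through the same prepped recipe before predict()
new_cars <- mtcars[1:3, ]
new_cars_baked <- bake(spline_cars_prepped, new_data = new_cars)
manual_preds <- predict(plain_lm_fit, new_data = new_cars_baked)
```

Both spline_cars_prepped and plain_lm_fit have to be kept together for this to work, which is the bookkeeping described above.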

workflows makes this easier by combining these objects together:

car_wflow <- workflow() %>% 
  add_recipe(spline_cars) %>% 
  add_model(bayes_lm)

Now you can prepare the recipe and estimate the model via a single call to fit():

car_wflow_fit <- fit(car_wflow, data = mtcars)
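A fitted workflow can also be passed to predict(), which applies the recipe to the new data before calling the model. The sketch below assumes the "lm" engine instead of "stan" so it runs without rstanarm; plain_lm and wflow_preds are illustrative names.

```r
library(recipes)
library(parsnip)
library(workflows)

spline_cars <- recipe(mpg ~ ., data = mtcars) %>%
  step_ns(disp, deg_free = 10)

# Assumption: "lm" is swapped in for "stan" to keep the sketch lightweight
plain_lm <- linear_reg() %>%
  set_engine("lm")

# Bundle the recipe and model, then prep and fit in one call
car_wflow_fit <- workflow() %>%
  add_recipe(spline_cars) %>%
  add_model(plain_lm) %>%
  fit(data = mtcars)

# Raw (unprocessed) rows go in; the workflow handles the preprocessing
wflow_preds <- predict(car_wflow_fit, new_data = mtcars[1:3, ])
```

The new data is supplied in its original form, so there is no separate prepped recipe to track.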

You can alter existing workflows using update_recipe() / update_model() and remove_recipe() / remove_model().
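As a rough illustration of these functions, the sketch below swaps the recipe in an unfitted workflow and then drops the model. The "lm" engine and the smaller_spline recipe are assumptions made for the example, not part of the original.

```r
library(recipes)
library(parsnip)
library(workflows)

spline_cars <- recipe(mpg ~ ., data = mtcars) %>%
  step_ns(disp, deg_free = 10)

# Assumption: "lm" is swapped in for "stan" to keep the sketch lightweight
car_wflow <- workflow() %>%
  add_recipe(spline_cars) %>%
  add_model(linear_reg() %>% set_engine("lm"))

# Hypothetical alternative recipe with fewer spline degrees of freedom
smaller_spline <- recipe(mpg ~ ., data = mtcars) %>%
  step_ns(disp, deg_free = 3)

# Replace the existing recipe; the model specification is left in place
car_wflow_updated <- update_recipe(car_wflow, smaller_spline)

# Remove the model; the workflow keeps only its preprocessor
car_wflow_no_model <- remove_model(car_wflow_updated)
```

Each call returns a new workflow object, so the original car_wflow is left unchanged.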

Contributing

This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

More Repositories

1. broom (R, 1,383 stars): Convert statistical analysis objects from R into tidy format
2. infer (R, 688 stars): An R package for tidyverse-friendly statistical inference
3. tidymodels (R, 679 stars): Easily install and load the tidymodels packages
4. corrr (R, 578 stars): Explore correlations in R
5. TMwR (RMarkdown, 541 stars): Code and content for "Tidy Modeling with R"
6. parsnip (R, 535 stars): A tidy unified interface to models
7. recipes (R, 498 stars): Pipeable steps for feature engineering and data preprocessing to prepare for modeling
8. yardstick (R, 338 stars): Tidy methods for measuring model performance
9. rsample (R, 316 stars): Classes and functions to create and summarize resampling objects
10. stacks (R, 279 stars): An R package for tidy stacked ensemble modeling
11. tidypredict (R, 251 stars): Run predictions inside the database
12. tune (R, 224 stars): Tools for tidy parameter tuning
13. textrecipes (R, 150 stars): Extra recipes for Text Processing
14. embed (R, 138 stars): Extra recipes for predictor embeddings
15. themis (R, 133 stars): Extra recipes steps for dealing with unbalanced data
16. butcher (R, 124 stars): Reduce the size of model objects saved to disk
17. censored (R, 115 stars): Parsnip wrappers for survival models
18. dials (R, 108 stars): Tools for creating tuning parameter values
19. probably (R, 102 stars): Tools for post-processing class probability estimates
20. tidyposterior (R, 102 stars): Bayesian comparisons of models using resampled statistics
21. tidymodels.org-legacy (HTML, 101 stars): Legacy Source of tidymodels.org
22. aml-training (HTML, 101 stars): The most recent version of the Applied Machine Learning notes
23. hardhat (R, 98 stars): Construct Modeling Packages
24. tidyclust (R, 93 stars): A tidy unified interface to clustering models
25. usemodels (R, 84 stars): Boilerplate Code for tidymodels
26. workflowsets (R, 83 stars): Create a collection of modeling workflows
27. modeldb (R, 77 stars): Run models inside a database using R
28. multilevelmod (R, 69 stars): Parsnip wrappers for mixed-level and hierarchical models
29. workshops (JavaScript, 63 stars): Website and materials for tidymodels workshops
30. spatialsample (R, 60 stars): Create and summarize spatial resampling objects 🗺
31. finetune (R, 59 stars): Additional functions for model tuning
32. brulee (R, 55 stars): High-Level Modeling Functions with 'torch'
33. learntidymodels (R, 54 stars): Learn tidymodels with interactive learnr primers
34. applicable (R, 42 stars): Quantify extrapolation of new samples given a training set
35. model-implementation-principles (HTML, 41 stars): recommendations for creating R modeling packages
36. shinymodels (R, 40 stars)
37. rules (R, 38 stars): parsnip extension for rule-based models
38. planning (35 stars): Documents to plan and discuss future development
39. bonsai (R, 33 stars): parsnip wrappers for tree-based models
40. discrim (R, 27 stars): Wrappers for discriminant analysis and naive Bayes models for use with the parsnip package
41. poissonreg (R, 22 stars): parsnip wrappers for Poisson regression
42. baguette (R, 21 stars): parsnip Model Functions for Bagging
43. modeldata (R, 21 stars): Data Sets Used by tidymodels Packages
44. agua (R, 19 stars): Create and evaluate models using 'tidymodels' and 'h2o'
45. plsmod (R, 13 stars): Model Wrappers for Projection Methods
46. cloudstart (12 stars): RStudio Cloud ☁️ resources to accompany tidymodels.org
47. extratests (R, 11 stars): Integration and other testing for tidymodels
48. tidymodels.org (JavaScript, 8 stars): Source of tidymodels.org
49. desirability2 (R, 7 stars): Desirability Functions for Multiparameter Optimization
50. .github (4 stars): GitHub contributing guidelines for tidymodels packages
51. modelenv (R, 3 stars): Provide Tools to Register Models for use in Tidymodels
52. survivalauc (C, 2 stars): What the Package Does (One Line, Title Case)
53. modeldatatoo (R, 1 star): More Data Sets Useful for Modeling Examples