  • Stars: 193
  • Rank: 193,684 (Top 4%)
  • Language: R
  • License: Other
  • Created over 4 years ago
  • Updated 2 months ago

Modeling Workflows

workflows

What is a workflow?

A workflow is an object that can bundle together your pre-processing, modeling, and post-processing requests. For example, if you have a recipe and parsnip model, these can be combined into a workflow. The advantages are:

  • You don't have to keep track of separate objects in your workspace.

  • The recipe prepping and model fitting can be executed using a single call to fit().

  • If you have custom tuning parameter settings, these can be defined using a simpler interface when combined with tune.

  • In the future, workflows will be able to add post-processing operations, such as modifying the probability cutoff for two-class models.

Installation

You can install workflows from CRAN with:

install.packages("workflows")

You can install the development version from GitHub with:

# install.packages("pak")
pak::pak("tidymodels/workflows")

Example

Suppose you were modeling data on cars: say, the fuel efficiency of 32 cars. You know that the relationship between engine displacement and miles-per-gallon is nonlinear, and you would like to model that as a spline before adding it to a Bayesian linear regression model. You might have a recipe to specify the spline:

library(recipes)
library(parsnip)
library(workflows)

spline_cars <- recipe(mpg ~ ., data = mtcars) %>% 
  step_ns(disp, deg_free = 10)

and a model object:

bayes_lm <- linear_reg() %>% 
  set_engine("stan")

To use these, you would generally run:

spline_cars_prepped <- prep(spline_cars, mtcars)
bayes_lm_fit <- fit(bayes_lm, mpg ~ ., data = bake(spline_cars_prepped, new_data = NULL))

You can't predict on new samples using bayes_lm_fit without the prepped version of spline_cars around. You also might have other models and recipes in your workspace. This might lead to getting them mixed up or forgetting to save the model/recipe pair that you are most interested in.
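
To make the bookkeeping concrete, here is a minimal sketch of what predicting on new samples looks like without a workflow. It is an illustration under stated assumptions, not the package authors' code: it swaps the "stan" engine for "lm" so that it runs without rstanarm, and new_cars is a hypothetical stand-in for new samples (simply the first three rows of mtcars).

```r
library(recipes)
library(parsnip)

spline_cars <- recipe(mpg ~ ., data = mtcars) %>%
  step_ns(disp, deg_free = 10)

# Assumption: the "lm" engine replaces "stan" to keep the sketch lightweight.
plain_lm <- linear_reg() %>%
  set_engine("lm")

# Manual route: estimate the recipe, then fit on the processed training data.
spline_cars_prepped <- prep(spline_cars, training = mtcars)
plain_lm_fit <- fit(
  plain_lm,
  mpg ~ .,
  data = bake(spline_cars_prepped, new_data = NULL)
)

# Hypothetical "new samples": the first three rows of mtcars.
new_cars <- head(mtcars, 3)

# The new data must pass through the same prepped recipe before predict(),
# so both plain_lm_fit and spline_cars_prepped have to be kept together.
new_cars_processed <- bake(spline_cars_prepped, new_data = new_cars)
manual_preds <- predict(plain_lm_fit, new_data = new_cars_processed)
```

Losing or mismatching either of the two objects breaks the final step, which is the problem described above.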

workflows makes this easier by combining these objects together:

car_wflow <- workflow() %>% 
  add_recipe(spline_cars) %>% 
  add_model(bayes_lm)

Now you can prepare the recipe and estimate the model via a single call to fit():

car_wflow_fit <- fit(car_wflow, data = mtcars)
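
Prediction then goes through the fitted workflow as well. The sketch below is self-contained and, as an assumption made only to avoid the rstanarm dependency, uses the "lm" engine instead of "stan"; new_cars is a hypothetical set of new samples taken from mtcars.

```r
library(recipes)
library(parsnip)
library(workflows)

spline_cars <- recipe(mpg ~ ., data = mtcars) %>%
  step_ns(disp, deg_free = 10)

# Assumption: "lm" engine used here in place of "stan".
plain_lm <- linear_reg() %>%
  set_engine("lm")

car_wflow <- workflow() %>%
  add_recipe(spline_cars) %>%
  add_model(plain_lm)

# One call preps the recipe and fits the model.
car_wflow_fit <- fit(car_wflow, data = mtcars)

# Hypothetical new samples; the raw, unprocessed columns are passed in
# and the workflow applies the trained recipe before predicting.
new_cars <- head(mtcars, 3)
wflow_preds <- predict(car_wflow_fit, new_data = new_cars)
```

Only car_wflow_fit needs to be saved to predict later, since it carries both the trained recipe and the fitted model.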

You can alter existing workflows using update_recipe() / update_model() and remove_recipe() / remove_model().
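
As a rough sketch of how these functions might be used: the replacement recipe (3 degrees of freedom) and the "lm" and "glm" engines below are illustrative choices, not taken from the original example.

```r
library(recipes)
library(parsnip)
library(workflows)

spline_cars <- recipe(mpg ~ ., data = mtcars) %>%
  step_ns(disp, deg_free = 10)

car_wflow <- workflow() %>%
  add_recipe(spline_cars) %>%
  add_model(linear_reg() %>% set_engine("lm"))

# Swap in a recipe with a less flexible spline (illustrative value).
simpler_spline <- recipe(mpg ~ ., data = mtcars) %>%
  step_ns(disp, deg_free = 3)
car_wflow_updated <- update_recipe(car_wflow, simpler_spline)

# Swap the model specification for one using a different engine.
car_wflow_updated <- update_model(
  car_wflow_updated,
  linear_reg() %>% set_engine("glm")
)

# Alternatively, remove a component and add a new one later.
car_wflow_no_model <- remove_model(car_wflow)
car_wflow_readded <- add_model(
  car_wflow_no_model,
  linear_reg() %>% set_engine("lm")
)

# A modified workflow has to be fit (again) before it can predict.
car_wflow_updated_fit <- fit(car_wflow_updated, data = mtcars)
```

Each of these calls returns a new workflow object, so the original car_wflow is left unchanged.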

Contributing

This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

More Repositories

1. broom: Convert statistical analysis objects from R into tidy format (R, 1,402 stars)
2. tidymodels: Easily install and load the tidymodels packages (R, 727 stars)
3. infer: An R package for tidyverse-friendly statistical inference (R, 702 stars)
4. corrr: Explore correlations in R (R, 583 stars)
5. parsnip: A tidy unified interface to models (R, 554 stars)
6. TMwR: Code and content for "Tidy Modeling with R" (RMarkdown, 552 stars)
7. recipes: Pipeable steps for feature engineering and data preprocessing to prepare for modeling (R, 534 stars)
8. yardstick: Tidy methods for measuring model performance (R, 354 stars)
9. rsample: Classes and functions to create and summarize resampling objects (R, 318 stars)
10. stacks: An R package for tidy stacked ensemble modeling (R, 284 stars)
11. tidypredict: Run predictions inside the database (R, 256 stars)
12. tune: Tools for tidy parameter tuning (R, 248 stars)
13. textrecipes: Extra recipes for Text Processing (R, 154 stars)
14. embed: Extra recipes for predictor embeddings (R, 140 stars)
15. themis: Extra recipes steps for dealing with unbalanced data (R, 138 stars)
16. butcher: Reduce the size of model objects saved to disk (R, 130 stars)
17. censored: Parsnip wrappers for survival models (R, 123 stars)
18. dials: Tools for creating tuning parameter values (R, 110 stars)
19. probably: Tools for post-processing class probability estimates (R, 108 stars)
20. tidyclust: A tidy unified interface to clustering models (R, 103 stars)
21. tidyposterior: Bayesian comparisons of models using resampled statistics (R, 101 stars)
22. tidymodels.org-legacy: Legacy Source of tidymodels.org (HTML, 100 stars)
23. aml-training: The most recent version of the Applied Machine Learning notes (HTML, 100 stars)
24. hardhat: Construct Modeling Packages (R, 99 stars)
25. workflowsets: Create a collection of modeling workflows (R, 88 stars)
26. usemodels: Boilerplate Code for tidymodels (R, 85 stars)
27. modeldb: Run models inside a database using R (R, 79 stars)
28. workshops: Website and materials for tidymodels workshops (JavaScript, 76 stars)
29. multilevelmod: Parsnip wrappers for mixed-level and hierarchical models (R, 72 stars)
30. spatialsample: Create and summarize spatial resampling objects 🗺 (R, 69 stars)
31. learntidymodels: Learn tidymodels with interactive learnr primers (R, 64 stars)
32. brulee: High-Level Modeling Functions with 'torch' (R, 62 stars)
33. finetune: Additional functions for model tuning (R, 61 stars)
34. shinymodels (R, 45 stars)
35. applicable: Quantify extrapolation of new samples given a training set (R, 43 stars)
36. model-implementation-principles: recommendations for creating R modeling packages (HTML, 40 stars)
37. bonsai: parsnip wrappers for tree-based models (R, 40 stars)
38. rules: parsnip extension for rule-based models (R, 39 stars)
39. planning: Documents to plan and discuss future development (36 stars)
40. discrim: Wrappers for discriminant analysis and naive Bayes models for use with the parsnip package (R, 28 stars)
41. baguette: parsnip Model Functions for Bagging (R, 23 stars)
42. modeldata: Data Sets Used by tidymodels Packages (R, 22 stars)
43. poissonreg: parsnip wrappers for Poisson regression (R, 22 stars)
44. agua: Create and evaluate models using 'tidymodels' and 'h2o' (R, 21 stars)
45. extratests: Integration and other testing for tidymodels (R, 20 stars)
46. tidymodels.org: Source of tidymodels.org (JavaScript, 16 stars)
47. plsmod: Model Wrappers for Projection Methods (R, 14 stars)
48. cloudstart: RStudio Cloud ☁️ resources to accompany tidymodels.org (12 stars)
49. desirability2: Desirability Functions for Multiparameter Optimization (R, 7 stars)
50. modeldatatoo: More Data Sets Useful for Modeling Examples (R, 5 stars)
51. .github: GitHub contributing guidelines for tidymodels packages (4 stars)
52. modelenv: Provide Tools to Register Models for use in Tidymodels (R, 3 stars)
53. survivalauc: What the Package Does (One Line, Title Case) (C, 2 stars)