Modeling Workflows

workflows

What is a workflow?

A workflow is an object that can bundle together your pre-processing, modeling, and post-processing requests. For example, if you have a recipe and parsnip model, these can be combined into a workflow. The advantages are:

  • You don't have to keep track of separate objects in your workspace.

  • The recipe prepping and model fitting can be executed using a single call to fit().

  • If you have custom tuning parameter settings, these can be defined using a simpler interface when combined with tune.

  • In the future, workflows will be able to add post-processing operations, such as modifying the probability cutoff for two-class models.

Installation

You can install workflows from CRAN with:

install.packages("workflows")

You can install the development version from GitHub with:

# install.packages("pak")
pak::pak("tidymodels/workflows")

Example

Suppose you were modeling data on cars, say, the fuel efficiency of 32 cars. You know that the relationship between engine displacement and miles-per-gallon is nonlinear, and you would like to model that as a spline before adding it to a Bayesian linear regression model. You might have a recipe to specify the spline:

library(recipes)
library(parsnip)
library(workflows)

spline_cars <- recipe(mpg ~ ., data = mtcars) %>% 
  step_ns(disp, deg_free = 10)

and a model object:

bayes_lm <- linear_reg() %>% 
  set_engine("stan")

To use these, you would generally run:

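# Estimate the spline basis from the training data
spline_cars_prepped <- prep(spline_cars, training = mtcars)
# Fit the model on the processed training set (bake(new_data = NULL) supersedes juice())
bayes_lm_fit <- fit(bayes_lm, mpg ~ ., data = bake(spline_cars_prepped, new_data = NULL))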

You can't predict on new samples using bayes_lm_fit without the prepped version of spline_cars around. You also might have other models and recipes in your workspace. This can lead to mixing them up or forgetting to save the model/recipe pair that you are most interested in.
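To make the dependency concrete, here is a minimal sketch of predicting by hand, where new samples must first pass through the prepped recipe. It makes two assumptions that are not in the example above: the "lm" engine is swapped in for "stan" so the sketch runs without rstanarm, and new_cars (the first three rows of mtcars) stands in for genuinely new data.

```r
library(recipes)
library(parsnip)

# Same spline recipe as in the example
spline_cars <- recipe(mpg ~ ., data = mtcars) %>%
  step_ns(disp, deg_free = 10)

# Assumption: "lm" engine instead of "stan", only to keep the sketch lightweight
lm_spec <- linear_reg() %>%
  set_engine("lm")

# Prep the recipe and fit the model on the processed training data
spline_cars_prepped <- prep(spline_cars, training = mtcars)
lm_fit <- fit(lm_spec, mpg ~ ., data = bake(spline_cars_prepped, new_data = NULL))

# Hypothetical "new" samples: they must be baked with the prepped recipe first
new_cars <- head(mtcars, 3)
new_cars_baked <- bake(spline_cars_prepped, new_data = new_cars)
manual_preds <- predict(lm_fit, new_data = new_cars_baked)
```

Both spline_cars_prepped and lm_fit have to be kept together for this to work, which is the bookkeeping a workflow removes.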

workflows makes this easier by combining these objects together:

car_wflow <- workflow() %>% 
  add_recipe(spline_cars) %>% 
  add_model(bayes_lm)

Now you can prepare the recipe and estimate the model via a single call to fit():

car_wflow_fit <- fit(car_wflow, data = mtcars)
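Once fitted, the workflow can be used for prediction directly, and it applies the recipe to the new data for you. The sketch below assumes the "lm" engine in place of "stan" (so it runs without rstanarm) and uses the first three rows of mtcars as stand-in new samples.

```r
library(recipes)
library(parsnip)
library(workflows)

spline_cars <- recipe(mpg ~ ., data = mtcars) %>%
  step_ns(disp, deg_free = 10)

# Assumption: "lm" engine instead of "stan", only to keep the sketch lightweight
lm_spec <- linear_reg() %>%
  set_engine("lm")

car_wflow <- workflow() %>%
  add_recipe(spline_cars) %>%
  add_model(lm_spec)

# One call preps the recipe and fits the model
car_wflow_fit <- fit(car_wflow, data = mtcars)

# predict() takes raw (unprocessed) data; the workflow handles the spline step
wflow_preds <- predict(car_wflow_fit, new_data = head(mtcars, 3))
```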

You can alter existing workflows using update_recipe() / update_model() and remove_recipe() / remove_model().
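As a rough illustration of these helpers, the sketch below swaps the recipe and the model in an unfitted workflow and then strips both out. The replacement objects (simpler_spline with 4 degrees of freedom and an "lm"-engine model) are hypothetical choices made for this example only.

```r
library(recipes)
library(parsnip)
library(workflows)

spline_cars <- recipe(mpg ~ ., data = mtcars) %>%
  step_ns(disp, deg_free = 10)
bayes_lm <- linear_reg() %>%
  set_engine("stan")

car_wflow <- workflow() %>%
  add_recipe(spline_cars) %>%
  add_model(bayes_lm)

# Hypothetical replacement recipe with a less flexible spline
simpler_spline <- recipe(mpg ~ ., data = mtcars) %>%
  step_ns(disp, deg_free = 4)

# Hypothetical replacement model using the "lm" engine
plain_lm <- linear_reg() %>%
  set_engine("lm")

# update_*() replaces the existing component in one step
car_wflow_updated <- car_wflow %>%
  update_recipe(simpler_spline) %>%
  update_model(plain_lm)

# remove_*() drops a component, leaving a slot to fill later
car_wflow_empty <- car_wflow %>%
  remove_recipe() %>%
  remove_model()
```

Each of these functions returns a new workflow object, so car_wflow itself is left unchanged.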

Contributing

This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
