  • Stars: 316
  • Rank: 121,180 (Top 3 %)
  • Language: R
  • License: Other
  • Created over 6 years ago
  • Updated 28 days ago

Repository Details

Classes and functions to create and summarize resampling objects

rsample

Overview

The rsample package provides functions to create different types of resamples and corresponding classes for their analysis. The goal is to have a modular set of methods that can be used for:

  • resampling for estimating the sampling distribution of a statistic
  • estimating model performance using a holdout set

The scope of rsample is to provide the basic building blocks for creating and analyzing resamples of a data set, but this package does not include code for modeling or calculating statistics. The Working with Resample Sets vignette gives a demonstration of how rsample tools can be used when building models.
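As a rough illustration of the two uses listed above, the sketch below bootstraps a statistic and then evaluates a model on a holdout set. The data set (mtcars), the statistic (median of mpg), and the model (lm from base R's stats package) are assumptions chosen for the example; rsample supplies only the resampling objects and the accessor functions.

```r
library(rsample)

set.seed(1)

# 1. Resampling to approximate the sampling distribution of a statistic.
#    mtcars and the median of mpg are illustrative choices only.
boots <- bootstraps(mtcars, times = 200)

# analysis() returns the resampled rows of one split as a data frame
boot_medians <- vapply(
  boots$splits,
  function(split) median(analysis(split)$mpg),
  numeric(1)
)

# The spread of the bootstrap medians approximates the standard error
boot_se <- sd(boot_medians)

# 2. Estimating model performance with a holdout set.
split <- initial_split(mtcars, prop = 0.75)
train <- training(split)
test  <- testing(split)

# The model itself comes from stats, not from rsample
fit <- lm(mpg ~ wt, data = train)
holdout_rmse <- sqrt(mean((test$mpg - predict(fit, newdata = test))^2))
```

The modeling and the statistic are computed outside rsample, which matches the scope described above.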

Note that resampled data sets created by rsample are directly accessible in a resampling object but add little memory overhead. Because the original data is not modified, R does not make an automatic copy.
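One way to see how a resample is accessed is to look at a single split. The sketch below assumes the built-in mtcars data as a stand-in and uses as.integer() on a split to retrieve row indices; the data frames themselves are materialized only when analysis() or assessment() is called.

```r
library(rsample)

set.seed(2)
# mtcars is an illustrative data set, not part of the original example
boots <- bootstraps(mtcars, times = 5)
first_split <- boots$splits[[1]]

# Row indices drawn (with replacement) for the analysis set
in_rows <- as.integer(first_split, data = "analysis")

# Rows never drawn form the assessment (out-of-bag) set
oob_rows <- setdiff(seq_len(nrow(mtcars)), in_rows)

# Materialize the two data frames on demand
analysis_df   <- analysis(first_split)
assessment_df <- assessment(first_split)
```

For a bootstrap split, the analysis set has as many rows as the original data, and the assessment set holds the rows that were not sampled.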

For example, creating 50 bootstraps of a data set does not create an object that is 50-fold larger in memory:

library(rsample)
library(mlbench)

data(LetterRecognition)
lobstr::obj_size(LetterRecognition)
#> 2,644,640 B

set.seed(35222)
boots <- bootstraps(LetterRecognition, times = 50)
lobstr::obj_size(boots)
#> 6,686,776 B

# Object size per resample
lobstr::obj_size(boots)/nrow(boots)
#> 133,735.5 B

# Fold increase is <<< 50
as.numeric(lobstr::obj_size(boots)/lobstr::obj_size(LetterRecognition))
#> [1] 2.528426

Created on 2022-02-28 by the reprex package (v2.0.1)

The 50 bootstrap samples together use about 2.5 times the memory of the original data set, far less than 50 times.

Installation

To install the released version from CRAN, use:

install.packages("rsample")

To install the development version from GitHub, use:

# install.packages("pak")
pak::pak("tidymodels/rsample")

Contributing

This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

More Repositories

1

broom

Convert statistical analysis objects from R into tidy format
R
1,383
star
2

infer

An R package for tidyverse-friendly statistical inference
R
688
star
3

tidymodels

Easily install and load the tidymodels packages
R
679
star
4

corrr

Explore correlations in R
R
578
star
5

TMwR

Code and content for "Tidy Modeling with R"
RMarkdown
541
star
6

parsnip

A tidy unified interface to models
R
535
star
7

recipes

Pipeable steps for feature engineering and data preprocessing to prepare for modeling
R
498
star
8

yardstick

Tidy methods for measuring model performance
R
338
star
9

stacks

An R package for tidy stacked ensemble modeling
R
282
star
10

tidypredict

Run predictions inside the database
R
251
star
11

tune

Tools for tidy parameter tuning
R
224
star
12

workflows

Modeling Workflows
R
188
star
13

textrecipes

Extra recipes for Text Processing
R
150
star
14

embed

Extra recipes for predictor embeddings
R
138
star
15

themis

Extra recipes steps for dealing with unbalanced data
R
133
star
16

butcher

Reduce the size of model objects saved to disk
R
124
star
17

censored

Parsnip wrappers for survival models
R
115
star
18

dials

Tools for creating tuning parameter values
R
108
star
19

probably

Tools for post-processing class probability estimates
R
102
star
20

tidyposterior

Bayesian comparisons of models using resampled statistics
R
102
star
21

tidymodels.org-legacy

Legacy Source of tidymodels.org
HTML
101
star
22

aml-training

The most recent version of the Applied Machine Learning notes
HTML
101
star
23

hardhat

Construct Modeling Packages
R
98
star
24

tidyclust

A tidy unified interface to clustering models
R
93
star
25

usemodels

Boilerplate Code for tidymodels
R
84
star
26

workflowsets

Create a collection of modeling workflows
R
83
star
27

modeldb

Run models inside a database using R
R
77
star
28

multilevelmod

Parsnip wrappers for mixed-level and hierarchical models
R
69
star
29

workshops

Website and materials for tidymodels workshops
JavaScript
63
star
30

spatialsample

Create and summarize spatial resampling objects 🗺
R
60
star
31

finetune

Additional functions for model tuning
R
59
star
32

brulee

High-Level Modeling Functions with 'torch'
R
55
star
33

learntidymodels

Learn tidymodels with interactive learnr primers
R
54
star
34

applicable

Quantify extrapolation of new samples given a training set
R
42
star
35

model-implementation-principles

recommendations for creating R modeling packages
HTML
41
star
36

shinymodels

R
40
star
37

rules

parsnip extension for rule-based models
R
38
star
38

planning

Documents to plan and discuss future development
35
star
39

bonsai

parsnip wrappers for tree-based models
R
33
star
40

discrim

Wrappers for discriminant analysis and naive Bayes models for use with the parsnip package
R
27
star
41

poissonreg

parsnip wrappers for Poisson regression
R
22
star
42

baguette

parsnip Model Functions for Bagging
R
21
star
43

modeldata

Data Sets Used by tidymodels Packages
R
21
star
44

agua

Create and evaluate models using 'tidymodels' and 'h2o'
R
19
star
45

plsmod

Model Wrappers for Projection Methods
R
13
star
46

cloudstart

RStudio Cloud ☁️ resources to accompany tidymodels.org
12
star
47

extratests

Integration and other testing for tidymodels
R
11
star
48

tidymodels.org

Source of tidymodels.org
JavaScript
8
star
49

desirability2

Desirability Functions for Multiparameter Optimization
R
7
star
50

.github

GitHub contributing guidelines for tidymodels packages
4
star
51

modelenv

Provide Tools to Register Models for use in Tidymodels
R
3
star
52

survivalauc

What the Package Does (One Line, Title Case)
C
2
star
53

modeldatatoo

More Data Sets Useful for Modeling Examples
R
1
star