• Stars
    star
    135
  • Rank 269,297 (Top 6 %)
  • Language
    R
  • License
    Other
  • Created almost 8 years ago
  • Updated about 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Wrap R for Sweet R Code

CRAN_Status_Badge status

wrapr is an R package that supplies powerful tools for writing and debugging R code.

Introduction

Primary wrapr services include:

library(wrapr)
packageVersion("wrapr")
 #  [1] '2.0.8'
date()
 #  [1] "Thu Jun 10 13:49:30 2021"

%.>% (dot pipe or dot arrow)

%.>% dot arrow pipe is a pipe with intended semantics:

β€œa %.>% b” is to be treated approximately as if the user had written β€œ{ . <- a; b };” with β€œ%.>%” being treated as left-associative.

Other R pipes include magrittr and pipeR.

The following two expressions should be equivalent:

cos(exp(sin(4)))
 #  [1] 0.8919465

4 %.>% sin(.) %.>% exp(.) %.>% cos(.)
 #  [1] 0.8919465

The notation is quite powerful as it treats pipe stages as expression parameterized over the variable β€œ.”. This means you do not need to introduce functions to express stages. The following is a valid dot-pipe:

1:4 %.>% .^2 
 #  [1]  1  4  9 16

The notation is also very regular as we show below.

1:4 %.>% sin
 #  [1]  0.8414710  0.9092974  0.1411200 -0.7568025
1:4 %.>% sin(.)
 #  [1]  0.8414710  0.9092974  0.1411200 -0.7568025
1:4 %.>% base::sin
 #  [1]  0.8414710  0.9092974  0.1411200 -0.7568025
1:4 %.>% base::sin(.)
 #  [1]  0.8414710  0.9092974  0.1411200 -0.7568025

1:4 %.>% function(x) { x + 1 }
 #  [1] 2 3 4 5
1:4 %.>% (function(x) { x + 1 })
 #  [1] 2 3 4 5

1:4 %.>% { .^2 } 
 #  [1]  1  4  9 16
1:4 %.>% ( .^2 )
 #  [1]  1  4  9 16

Regularity can be a big advantage in teaching and comprehension. Please see β€œIn Praise of Syntactic Sugar” for more details. Some formal documentation can be found here.

  • Some obvious β€œdot-free”" right-hand sides are rejected. Pipelines are meant to move values through a sequence of transforms, and not just for side-effects. Example: `5 %.>% 6` deliberately stops as `6` is a right-hand side that obviously does not use its incoming value. This check is only applied to values, not functions on the right-hand side.
  • Trying to pipe into a an β€œzero argument function evaluation expression” such as `sin()` is prohibited as it looks too much like the user declaring `sin()` takes no arguments. One must pipe into either a function, function name, or an non-trivial expression (such as `sin(.)`). A useful error message is returned to the user: `wrapr::pipe does not allow direct piping into a no-argument function call expression (such as "sin()" please use sin(.))`.
  • Some reserved words can not be piped into. One example is `5 %.>% return(.)` is prohibited as the obvious pipe implementation would not actually escape from user functions as users may intend.
  • Obvious de-references (such as `$`, `::`, `@`, and a few more) on the right-hand side are treated performed (example: `5 %.>% base::sin(.)`).
  • Outer parenthesis on the right-hand side are removed (example: `5 %.>% (sin(.))`).
  • Anonymous function constructions are evaluated so the function can be applied (example: `5 %.>% function(x) {x+1}` returns 6, just as `5 %.>% (function(x) {x+1})(.)` does).
  • Checks and transforms are not performed on items inside braces (example: `5 %.>% { function(x) {x+1} }` returns `function(x) {x+1}`, not 6).
  • The dot arrow pipe has S3/S4 dispatch (please see ). However as the right-hand side of the pipe is normally held unevaluated, we don’t know the type except in special cases (such as the rigth-hand side being referred to by a name or variable). To force the evaluation of a pipe term, simply wrap it in `.()`.

The dot pipe is also user configurable through standard S3/S4 methods.

The dot pipe has been formally written up in the R Journal.

@article{RJ-2018-042,
  author = {John Mount and Nina Zumel},
  title = {{Dot-Pipe: an S3 Extensible Pipe for R}},
  year = {2018},
  journal = {{The R Journal}},
  url = {https://journal.r-project.org/archive/2018/RJ-2018-042/index.html}
}

unpack/to multiple assignments

Unpack a named list into the current environment by name (for a positional based multiple assignment operator please see zeallot, for another named base multiple assigment please see vadr::bind).

d <- data.frame(
  x = 1:9,
  group = c('train', 'calibrate', 'test'),
  stringsAsFactors = FALSE)

unpack[
  train_data = train,
  calibrate_data = calibrate,
  test_data = test
  ] := split(d, d$group)

knitr::kable(train_data)
x group
1 1 train
4 4 train
7 7 train

as_named_list

Build up named lists. Very convenient for managing workspaces when used with used with unpack/to.

as_named_list(train_data, calibrate_data, test_data)
 #  $train_data
 #    x group
 #  1 1 train
 #  4 4 train
 #  7 7 train
 #  
 #  $calibrate_data
 #    x     group
 #  2 2 calibrate
 #  5 5 calibrate
 #  8 8 calibrate
 #  
 #  $test_data
 #    x group
 #  3 3  test
 #  6 6  test
 #  9 9  test

build_frame() / draw_frame()

build_frame() is a convenient way to type in a small example data.frame in natural row order. This can be very legible and saves having to perform a transpose in one’s head. draw_frame() is the complimentary function that formats a given data.frame (and is a great way to produce neatened examples).

x <- build_frame(
   "measure"                   , "training", "validation" |
   "minus binary cross entropy", 5         , -7           |
   "accuracy"                  , 0.8       , 0.6          )
print(x)
 #                       measure training validation
 #  1 minus binary cross entropy      5.0       -7.0
 #  2                   accuracy      0.8        0.6
str(x)
 #  'data.frame':   2 obs. of  3 variables:
 #   $ measure   : chr  "minus binary cross entropy" "accuracy"
 #   $ training  : num  5 0.8
 #   $ validation: num  -7 0.6
cat(draw_frame(x))
 #  x <- wrapr::build_frame(
 #     "measure"                     , "training", "validation" |
 #       "minus binary cross entropy", 5         , -7           |
 #       "accuracy"                  , 0.8       , 0.6          )

qc() (quoting concatenate)

qc() is a quoting variation on R’s concatenate operator c(). This code such as the following:

qc(a = x, b = y)
 #    a   b 
 #  "x" "y"

qc(one, two, three)
 #  [1] "one"   "two"   "three"

qc() also allows bquote() driven .()-style argument escaping.

aname <- "I_am_a"
yvalue <- "six"

qc(.(aname) := x, b = .(yvalue))
 #  I_am_a      b 
 #     "x"  "six"

Notice the := notation is required for syntacitic reasons.

:= (named map builder)

:= is the β€œnamed map builder”. It allows code such as the following:

'a' := 'x'
 #    a 
 #  "x"

The important property of named map builder is it accepts values on the left-hand side allowing the following:

name <- 'variableNameFromElsewhere'
name := 'newBinding'
 #  variableNameFromElsewhere 
 #               "newBinding"

A nice property is := commutes (in the sense of algebra or category theory) with R’s concatenation function c(). That is the following two statements are equivalent:

c('a', 'b') := c('x', 'y')
 #    a   b 
 #  "x" "y"

c('a' := 'x', 'b' := 'y')
 #    a   b 
 #  "x" "y"

The named map builder is designed to synergize with seplyr.

%?% (coalesce)

The coalesce operator tries to replace elements of its first argument with elements from its second argument. In particular %?% replaces NULL vectors and NULL/NA entries of vectors and lists.

Example:

c(1, NA) %?% list(NA, 20)
 #  [1]  1 20

%.|% (reduce/expand args)

x %.|% f stands for f(x[[1]], x[[2]], ..., x[[length(x)]]). v %|.% x also stands for f(x[[1]], x[[2]], ..., x[[length(x)]]). The two operators are the same, the variation just allowing the user to choose the order they write things. The mnemonic is: β€œdata goes on the dot-side of the operator.”

args <- list('prefix_', c(1:3), '_suffix')

args %.|% paste0
 #  [1] "prefix_1_suffix" "prefix_2_suffix" "prefix_3_suffix"
# prefix_1_suffix" "prefix_2_suffix" "prefix_3_suffix"

paste0 %|.% args
 #  [1] "prefix_1_suffix" "prefix_2_suffix" "prefix_3_suffix"
# prefix_1_suffix" "prefix_2_suffix" "prefix_3_suffix"

DebugFnW()

DebugFnW() wraps a function for debugging. If the function throws an exception the execution context (function arguments, function name, and more) is captured and stored for the user. The function call can then be reconstituted, inspected and even re-run with a step-debugger. Please see our free debugging video series and vignette('DebugFnW', package='wrapr') for examples.

Ξ»() (anonymous function builder)

Ξ»() is a concise abstract function creator or β€œlambda abstraction”. It is a placeholder that allows the use of the -character for very concise function abstraction.

Example:

# Make sure lambda function builder is in our enironment.
wrapr::defineLambda()

# square numbers 1 through 4
sapply(1:4, Ξ»(x, x^2))
 #  [1]  1  4  9 16

let()

let() allows execution of arbitrary code with substituted variable names (note this is subtly different than binding values for names as with base::substitute() or base::with()).

The function is simple and powerful. It treats strings as variable names and re-writes expressions as if you had used the denoted variables. For example the following block of code is equivalent to having written β€œa + a”.

a <- 7

let(
  c(VAR = 'a'),
  
  VAR + VAR
)
 #  [1] 14

This is useful in re-adapting non-standard evaluation interfaces (NSE interfaces) so one can script or program over them.

We are trying to make let() self teaching and self documenting (to the extent that makes sense). For example try the arguments β€œeval=FALSE” prevent execution and see what would have been executed, or debug=TRUE to have the replaced code printed in addition to being executed:

let(
  c(VAR = 'a'),
  eval = FALSE,
  {
    VAR + VAR
  }
)
 #  {
 #      a + a
 #  }

let(
  c(VAR = 'a'),
  debugPrint = TRUE,
  {
    VAR + VAR
  }
)
 #  $VAR
 #  [1] "a"
 #  
 #  {
 #      a + a
 #  }
 #  [1] 14

Please see vignette('let', package='wrapr') for more examples. Some formal documentation can be found here. wrapr::let() was inspired by gtools::strmacro() and base::bquote(), please see here for some notes on macro methods in R.

evalb()/si() (evaluate with bquote / string interpolation)

wrapr supplies unified notation for quasi-quotation and string interpolation.

angle = 1:10
variable <- "angle"

# execute code
evalb(
  plot(x = .(-variable), y = sin(.(-variable)))
)

# alter string
si("plot(x = .(variable), y = .(variable))")
 #  [1] "plot(x = \"angle\", y = \"angle\")"

The extra .(-x) form is a shortcut for .(as.name(x)).

sortv() (sort a data.frame by a set of columns)

This is the sort command that is missing from R: sort a data.frame by a chosen set of columns specified in a variable.

d <- data.frame(
  x = c(2, 2, 3, 3, 1, 1), 
  y = 6:1,
  z = 1:6)
order_cols <- c('x', 'y')

sortv(d, order_cols)
 #    x y z
 #  6 1 1 6
 #  5 1 2 5
 #  2 2 5 2
 #  1 2 6 1
 #  4 3 3 4
 #  3 3 4 3

Installation

Install with:

install.packages("wrapr")

More Information

More details on wrapr capabilities can be found in the following two technical articles:

Note

Note: wrapr is meant only for β€œtame names”, that is: variables and column names that are also valid simple (without quotes) R variables names.

More Repositories

1

zmPDSwR

Example R scripts and data for "Practical Data Science with R" 1st edition by Nina Zumel and John Mount (Manning Publications)
HTML
477
star
2

vtreat

vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under choice of GPL-2 or GPL-3 license.
HTML
283
star
3

Examples

Various examples for different articles
HTML
152
star
4

PDSwR2

Code, Data, and Examples for Practical Data Science with R 2nd edition (Nina Zumel and John Mount) https://github.com/WinVector/PDSwR2
HTML
130
star
5

pyvtreat

vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under a BSD-3-Clause license.
Python
119
star
6

data_algebra

Codd method-chained SQL generator and Pandas data processing in Python.
Python
114
star
7

rquery

Data Wrangling and Query Generating Operators for R. Distributed under choice of GPL-2 or GPL-3 license.
HTML
109
star
8

WVPlots

Pre-packaged plots in R
R
84
star
9

replyr

Patches for using dplyr with Databases and Big Data
HTML
67
star
10

BigDataRStrata2017

All material for "Modeling big data with R, sparklyr, and Apache Spark" Strata Hadoop 2017.
HTML
63
star
11

seplyr

Improved Standard Evaluation Interfaces for Common Data Manipulation Tasks
R
49
star
12

cdata

Higher order fluid or coordinatized data transforms in R. Distributed under choice of GPL-2 or GPL-3 license.
R
44
star
13

CampaignPlanner

Example code for Lesson on Response Campaign planning
HTML
38
star
14

rqdatatable

Implement the rquery piped query algebra in R using data.table. Distributed under choice of GPL-2 or GPL-3 license.
R
37
star
15

Logistic

Experimental logistic regression code supporting multiple result categories, many levels of categorical modeling variables, good optimization, L2 regularization and more.
Java
35
star
16

AutoDiff

Example automatic differentiation code in Scala
Scala
30
star
17

sigr

Concise formatting of significances in R (GPL3 license).
HTML
27
star
18

ExploreModels

Code and data for "The Geometry of Classifiers"
R
26
star
19

WinVector.github.io

Viewable pages from WinVector LLC view at: http://winvector.github.io
HTML
23
star
20

NestedModelsTalk

Support materials for WinVector talk
19
star
21

CampaignPlanner_v3

Shiny demo of A/B test planning and evaluation (improved UI for A/B testing method taught in free video course)
R
17
star
22

WVLPSolver

Experimental pure Java revised simplex linear program solver (Apache 2.0 license)
Java
15
star
23

Locality-Sensitive-Hashing-Example

Simple example of Locality Sensitive Hashing
Java
14
star
24

RcppDynProg

Dynamic Programming implemented in Rcpp. Includes example partition and out of sample fitting applications.
C++
14
star
25

ODSCWest2017

Win-Vector LLC ODSC West 2017 presentation materials (will be populated by the day of the conference)
HTML
14
star
26

kcomp

Demonstration of parametric bootstrap to find k for kmeans
HTML
10
star
27

ValidatingModelsInR

Slides and code for "Validating Models in R" Strata 2016 RDay http://conferences.oreilly.com/strata/hadoop-big-data-ca/public/schedule/detail/48053
HTML
10
star
28

SQLScrewdriver

Iterate through database tables (by JDBC) and TSV(tab separated values)/CSV(comma separated values) and load/dump data.
Java
8
star
29

wvpy

Tools to convert from Jupyter notebooks to and from Python .py files, and render.
HTML
8
star
30

FastBaseR

Examples of fast grouped row-wise operations in R (no C, C++, data.table, or dplyr used).
R
6
star
31

VectorDemo

Tutorial on using vectors in data science projects.
Jupyter Notebook
3
star
32

OutOfCore

Example of out of core coding techniques
Java
2
star
33

Importance-Sampling

Importance Sampling Example
Java
2
star
34

ExampleRPackage

Example of how to build a simple R package
R
2
star
35

ClassifierMetrics

Some examples of measuring classifier performance in R
HTML
2
star
36

QSurvival

Quasi observation based survival package for R.
R
2
star
37

LStep

Trivial demonstration of a diverging Newton-Raphson step when solving a logistic regression
Java
2
star
38

JXREF

Java based XML tool to help check Manning Agile Author XML for cross reference problems (Java based, GPL3+ license)
Java
1
star
39

ExperimentInspector

Java code to build synthetic data sets that match reported summary totals. Helps explore possible range of variation.
Java
1
star
40

crosspca

Cross-validated PCA/PCR demonstration based on the work: http://www.win-vector.com/blog/2016/05/pcr_part2_yaware/
R
1
star
41

SessionExample

Example code for articles on sessionizing data.
1
star
42

daccum

Example library to accumulate data frame rows in R
R
1
star
43

YConditionalRegularizedModel

Example of a neural net model, with regularization on y-conditional activation patterns
Jupyter Notebook
1
star
44

ATasteOfDataScience

Working an example of supervised machine learning in Python
Jupyter Notebook
1
star
45

CVRTSEncoder

Spectral encoding of categorical variables using model residual trajectories
R
1
star
46

wvu

Win Vector LLC Python data science teaching tools (graphs and data manipulation)
HTML
1
star
47

BreakingNestedModelBias

Support materials for Win-Vector blog article
HTML
1
star