Support Types for Variables, Arguments, and Return Values

typed

{typed} implements a type system for R. It has 3 main features:

  • set variable types in a script or the body of a function, so they can’t be assigned illegal values
  • set argument types in a function definition
  • set return type of a function

The user can define their own types, or leverage assertions from other packages.

Under the hood, variable types use active bindings, so once a variable is restricted by an assertion, it cannot be modified in a way that would violate that assertion.
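To make the active binding idea concrete, here is a minimal base R sketch. It is not the package's implementation: the helper restrict and the assertion is_character are hypothetical names introduced only for illustration, and the binding is created in an explicit environment e.

```r
# Hypothetical helper (not part of {typed}): create a binding `name` in `env`
# whose every assignment goes through `assertion`.
restrict <- function(name, assertion, env) {
  stored <- NULL  # the current value lives in this closure
  makeActiveBinding(name, function(value) {
    if (missing(value)) return(stored)  # read access: return the stored value
    stored <<- assertion(value)         # write access: check first, then store
    invisible(stored)
  }, env)
}

# A hand-written assertion: returns its input or fails.
is_character <- function(value) {
  if (!is.character(value)) stop("type mismatch: expected character")
  value
}

e <- new.env()
restrict("x", is_character, e)
e$x <- "a"                                # passes the check
failed <- try(e$x <- 2, silent = TRUE)    # rejected by the assertion
inherits(failed, "try-error")
#> [1] TRUE
e$x                                       # the previous value is preserved
#> [1] "a"
```

Because the check runs inside the binding itself, no later assignment can bypass it, which is the property the paragraph above describes.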

Installation

Install the CRAN version with:

install.packages("typed")

Or install the development version with:

remotes::install_github("moodymudskipper/typed")

And attach with:

# masking warning about overriding `?`
library(typed, warn.conflicts = FALSE) 

Set variable type

Question mark notation and declare

Here are examples of how we would set types:

Character() ? x # restrict x to "character" type
x <- "a"
x
#> [1] "a"

Integer(3) ? y <- 1:3 # restrict y to "integer" type of length 3
y
#> [1] 1 2 3

We cannot assign values of the wrong type to x and y anymore.

x <- 2
#> Error: type mismatch
#> `typeof(value)`: "double"   
#> `expected`:      "character"

y <- 4:5
#> Error: length mismatch
#> `length(value)`: 2
#>      `expected`: 3

But the right type will work.

x <- c("b", "c")

y <- c(1L, 10L, 100L)

declare is a strict equivalent that is slightly more efficient; its interface resembles base::assign.

declare("x", Character())
x <- "a"
x
#> [1] "a"

declare("y", Integer(3), 1:3)
y
#> [1] 1 2 3

Assertion factories and assertions

Integer and Character are function factories (functions that return functions), thus Integer(3) and Character() are functions.

These functions run checks on a value and, on success, return that value, generally unmodified. For instance:

Integer(3)(1:2)
#> Error: length mismatch
#> `length(value)`: 2
#>      `expected`: 3

Character()(3)
#> Error: type mismatch
#> `typeof(value)`: "double"   
#> `expected`:      "character"

We call Integer(3) and Character() assertions, and we call Integer and Character assertion factories.
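The factory/assertion distinction can be sketched in base R as follows. Integer_sketch is a hypothetical stand-in written for this illustration, not the package's Integer; its argument is named len here to avoid confusion with base::length.

```r
# Hypothetical assertion factory: calling it returns an assertion (a function).
Integer_sketch <- function(len = NULL) {
  function(value) {
    if (!is.integer(value)) {
      stop("type mismatch: expected integer, got ", typeof(value))
    }
    if (!is.null(len) && length(value) != len) {
      stop("length mismatch: expected ", len, ", got ", length(value))
    }
    value  # on success the value is returned unmodified
  }
}

int3 <- Integer_sketch(3L)   # an assertion built by the factory
is.function(int3)
#> [1] TRUE
identical(int3(1:3), 1:3)
#> [1] TRUE
msg <- tryCatch(int3(1:2), error = conditionMessage)
msg
#> [1] "length mismatch: expected 3, got 2"
```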

The package contains many assertion factories (see ?assertion_factories); the main ones are:

  • Any (No default restriction)
  • Logical
  • Integer
  • Double
  • Character
  • List
  • Environment
  • Factor
  • Matrix
  • Data.frame
  • Date
  • Time (POSIXct)

Custom assertions

As we’ve seen with Integer(3), passing arguments to an assertion factory restricts the type.

For instance, Integer has the arguments length, null_ok and .... We already used length; null_ok is convenient to allow a default NULL value in addition to the "integer" type. In the dots we can use arguments named after functions, each with the expected result of that function as its value.

Integer(anyNA = FALSE) ? x <- c(1L, 2L, NA)
#> Error: `anyNA` mismatch
#> `anyNA(value)`: TRUE 
#> `expected`:     FALSE

Useful arguments include, for instance, anyDuplicated = 0L, names = NULL or attributes = NULL. Any available function can be used.
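The "argument named after a function, with the expected result as its value" convention can be imitated in a few lines of base R. This is only a sketch of the idea; check_properties is a hypothetical helper and does not reflect how the package implements it.

```r
# Hypothetical helper: each named argument in `...` is read as
# "calling this function on `value` must give exactly this result".
check_properties <- function(value, ...) {
  expected <- list(...)
  for (fun_name in names(expected)) {
    actual <- get(fun_name, mode = "function")(value)
    if (!identical(actual, expected[[fun_name]])) {
      stop("`", fun_name, "` mismatch")
    }
  }
  value
}

ok <- check_properties(c(1L, 2L), anyNA = FALSE, anyDuplicated = 0L, names = NULL)
ok
#> [1] 1 2
msg <- tryCatch(check_properties(c(1L, 2L, NA), anyNA = FALSE),
                error = conditionMessage)
msg
#> [1] "`anyNA` mismatch"
```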

That makes assertion factories very flexible! If this is still not flexible enough, we can provide conditions as formulas in the .... Be careful to skip all named arguments by adding commas, or to name the formula argument ... explicitly.

fruit <- Character(1, ... = "`value` is not a fruit!" ~ . %in% c("apple", "pear", "cherry"))

fruit ? x <- "potatoe"
#> Error: `value` is not a fruit!
#> `value %in% c("apple", "pear", "cherry")`: FALSE
#> `expected`:                                TRUE

The arguments can differ between assertion factories. For instance, Data.frame has nrow, ncol, each, null_ok and ...:

Data.frame() ? x <- iris
Data.frame(ncol = 2) ? x <- iris
#> Error: Column number mismatch
#> `ncol(value)`: 5
#>    `expected`: 2
Data.frame(each = Double()) ? x <- iris
#> Error: column 5 ("Species") type mismatch
#> `typeof(value)`: "integer"
#> `expected`:      "double"

Leverage assertions from other packages, build your own assertion factories

Some great packages provide assertions, and these can be used with {typed} provided that they take the object as their first argument and return the object when no check fails. Richie Cotton’s {assertive} and Michel Lang’s {checkmate} both qualify.

library(assertive)
assert_is_monotonic_increasing ? z
z <- 3:1
#> Error: is_monotonic_increasing : The values of assigned_value are not monotonic increasing.
#>   Position ValueBefore ValueAfter
#> 1      1/2           3          2
#> 2      2/3           2          1

If we want to use more than the first argument, we should create an assertion factory:

Monotonic_incr <- as_assertion_factory(assert_is_monotonic_increasing)
Monotonic_incr(strictly = TRUE) ? z
z <- c(1, 1, 2)
#> Error: is_monotonic_increasing : The values of value are not strictly monotonic increasing.
#>   Position ValueBefore ValueAfter
#> 1      1/2           1          1

as_assertion_factory can also be used to create your own assertion factories from scratch; in fact, it is used to build the native assertion factories of this package.

Constants

To define a constant, we just surround the variable with parentheses (think of them as a protection):

Double() ? (x) <- 1
x <- 2
#> Error: Can't assign to a constant

? (y) <- 1
y <- 2
#> Error: Can't assign to a constant
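For comparison, base R offers a related protection through locked bindings. The sketch below is a base R analogue only: it makes no claim about how {typed} implements constants, and the error it raises is base R's own rather than the message shown above.

```r
e <- new.env()
assign("x", 1, envir = e)
lockBinding("x", e)   # from now on the binding cannot be changed

failed <- try(assign("x", 2, envir = e), silent = TRUE)
inherits(failed, "try-error")
#> [1] TRUE
get("x", envir = e)   # the value is unchanged
#> [1] 1
```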

Set argument type

We can set argument types this way:

add <- ? function (x= ? Double(), y= 1 ? Double()) {
  x + y
}

Note that we started the definition with a ?, and that we gave a default to y but not to x. Note also the = sign next to x, which is necessary even when there is no default value. If you forget it, you’ll get an error “unexpected ? in …”.

This created the following function, by adding checks at the top of the body

add
#> # typed function
#> function (x, y = 1) 
#> {
#>     check_arg(x, Double())
#>     check_arg(y, Double())
#>     x + y
#> }
#> # Arg types:
#> # x: Double()
#> # y: Double()

Let’s test it by providing a right and wrong type.

add(2, 3)
#> [1] 5
add(2, 3L)
#> Error: In `add(2, 3L)` at `check_arg(y, Double())`:
#> wrong argument to function, type mismatch
#> `typeof(value)`: "integer"
#> `expected`:      "double"
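The idea of adding checks at the top of a function body can be reproduced with base R's body() accessors. This is a sketch under our own assumptions: add_checks is a hypothetical helper, and plain stopifnot() calls stand in for check_arg.

```r
# Hypothetical helper: prepend unevaluated calls to the body of `fun`.
add_checks <- function(fun, checks) {
  old_body <- body(fun)
  # Unwrap an existing `{ ... }` so that the result is a single flat block.
  exprs <- if (is.call(old_body) && identical(old_body[[1]], as.name("{"))) {
    as.list(old_body)[-1]
  } else {
    list(old_body)
  }
  body(fun) <- as.call(c(list(as.name("{")), checks, exprs))
  fun
}

add_plain <- function(x, y = 1) {
  x + y
}
add_checked <- add_checks(add_plain, list(
  quote(stopifnot(is.double(x))),
  quote(stopifnot(is.double(y)))
))

add_checked(2, 3)
#> [1] 5
failed <- try(add_checked(2, 3L), silent = TRUE)  # y is an integer: rejected
inherits(failed, "try-error")
#> [1] TRUE
```

The formals, including the default y = 1, are left untouched; only the body changes.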

If we want to restrict x and y to the type “double” in the rest of the body of the function, we can use the ?+ notation:

add <- ? function (x= ?+ Double(), y= 1 ?+ Double()) {
  x + y
}

add
#> # typed function
#> function (x, y = 1) 
#> {
#>     check_arg(x, Double(), .bind = TRUE)
#>     check_arg(y, Double(), .bind = TRUE)
#>     x + y
#> }
#> # Arg types:
#> # x: Double()
#> # y: Double()

We see that it is translated into a check_arg call containing a .bind = TRUE argument.

If we want to restrict the quoted expression rather than the value of an argument, we can use ?~:

identity_sym_only <- ? function (x= ?~ Symbol()) {
  x
}

a <- 1
identity_sym_only(a)
#> [1] 1
identity_sym_only(a + a)
#> Error: In `identity_sym_only(a + a)` at `check_arg(substitute(x), Symbol())`:
#> wrong argument to function, type mismatch
#> `typeof(value)`: "language"
#> `expected`:      "symbol"

identity_sym_only
#> # typed function
#> function (x) 
#> {
#>     check_arg(substitute(x), Symbol())
#>     x
#> }
#> <bytecode: 0x000000001cb34218>
#> # Arg types:
#> # x: ~Symbol()

We see that it is translated into a check_arg call containing a call to substitute as the first argument. The ~ is kept in the attributes of the function.
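The same kind of check on the quoted expression can be written by hand with substitute(). In this sketch, sym_only_sketch is a hypothetical function and is.symbol() stands in for the Symbol() assertion.

```r
sym_only_sketch <- function(x) {
  expr <- substitute(x)  # the expression the caller typed, not its value
  if (!is.symbol(expr)) {
    stop("type mismatch: expected symbol, got ", typeof(expr))
  }
  x
}

a <- 1
sym_only_sketch(a)
#> [1] 1
msg <- tryCatch(sym_only_sketch(a + a), error = conditionMessage)
msg
#> [1] "type mismatch: expected symbol, got language"
```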

We can also check the ..., for instance use function(... = ? Integer()) to check that only integers are passed to the dots, and use function(... = ?~ Symbol()) to check that all quoted values passed to ... are symbols.

The special assertion factory Dots can also be used. In that case the checks apply to list(...) rather than to each element individually; for instance, function(... = ? Dots(2)) makes sure the dots were fed 2 values. In a similar fashion, function(... = ?~ Dots(2)) can be used to apply checks to the list of quoted arguments passed to ....
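The three kinds of checks on the dots (on each value, on the list as a whole, and on the quoted expressions) can be illustrated in plain base R. This is a hand-written sketch; pair_of_ints is a hypothetical function and does not use the package.

```r
pair_of_ints <- function(...) {
  values <- list(...)
  # Check on the list as a whole, comparable to Dots(2).
  if (length(values) != 2L) {
    stop("expected 2 values in the dots, got ", length(values))
  }
  # Check on each value, comparable to Integer().
  if (!all(vapply(values, is.integer, logical(1)))) {
    stop("all values passed to the dots must be integers")
  }
  # Check on each quoted expression, comparable to ~Symbol().
  quoted <- as.list(substitute(list(...)))[-1]
  if (!all(vapply(quoted, is.symbol, logical(1)))) {
    stop("all expressions passed to the dots must be symbols")
  }
  values
}

i <- 1L
j <- 2L
res <- pair_of_ints(i, j)
length(res)
#> [1] 2
msg_count <- tryCatch(pair_of_ints(i), error = conditionMessage)
msg_count
#> [1] "expected 2 values in the dots, got 1"
msg_quoted <- tryCatch(pair_of_ints(i, 2L), error = conditionMessage)
msg_quoted
#> [1] "all expressions passed to the dots must be symbols"
```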

Set function return type

To set a return type, we use ? before the function definition as in the previous section, but we put an assertion on its left-hand side.

add_or_subtract <- Double() ? function (x, y, subtract = FALSE) {
  if(subtract) return(x - y)
  x + y
}
add_or_subtract
#> # typed function
#> function (x, y, subtract = FALSE) 
#> {
#>     if (subtract) 
#>         return(check_output(x - y, Double()))
#>     check_output(x + y, Double())
#> }
#> # Return type: Double()

We see that the returned values have been wrapped inside check_output calls.
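A simpler, though less transparent, way to get a similar guarantee in base R is to wrap the whole function instead of rewriting its body. This is a different technique from the one shown above; with_return_check and is_double are hypothetical names used only for this sketch.

```r
# Hypothetical wrapper: run `assertion` on whatever `fun` returns.
with_return_check <- function(fun, assertion) {
  function(...) assertion(fun(...))
}

is_double <- function(value) {
  if (!is.double(value)) {
    stop("type mismatch: expected double, got ", typeof(value))
  }
  value
}

plain_add_or_subtract <- function(x, y, subtract = FALSE) {
  if (subtract) return(x - y)
  x + y
}
checked <- with_return_check(plain_add_or_subtract, is_double)

checked(1, 2)
#> [1] 3
checked(1, 2, subtract = TRUE)
#> [1] -1
msg <- tryCatch(checked(1L, 2L), error = conditionMessage)
msg
#> [1] "type mismatch: expected double, got integer"
```

The wrapper covers every return path at once, but it hides the original argument list behind ..., which rewriting each return value in place avoids.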

Putting it all together, write packages using {typed}

Let’s define a function for our package and document it with {roxygen2}. It is documented as usual, except that you’ll need to make sure to add the @name tag.

We declare types for the return value, for all arguments, and we declare a string msg.

#' add_or_subtract
#'
#' @param x double of length 1
#' @param y double of length 1
#' @param subtract whether to subtract instead of adding
#' @export
#' @name add_or_subtract
add_or_subtract <- 
  Double(1) ? function (
    x= ? Double(1), 
    y= ? Double(1), 
    subtract = FALSE ? Logical(1, anyNA = FALSE)
    ) {
    Character(1) ? msg
    if(subtract) {
      msg <- "subtracting"
      message(msg)
      return(x - y)
    }
    msg <- "adding"
    message(msg)
    x + y
  }

The created function is shown below. We see that Character(1) ? msg was changed into a declare call too; this is both for efficiency and readability. Unfamiliar users might be intimidated by ?, and calls to ? don’t print nicely.

add_or_subtract
#> # typed function
#> function (x, y, subtract = FALSE) 
#> {
#>     check_arg(x, Double(1))
#>     check_arg(y, Double(1))
#>     check_arg(subtract, Logical(1, anyNA = FALSE))
#>     declare("msg", Character(1))
#>     if (subtract) {
#>         msg <- "subtracting"
#>         message(msg)
#>         return(check_output(x - y, Double(1)))
#>     }
#>     msg <- "adding"
#>     message(msg)
#>     check_output(x + y, Double(1))
#> }
#> # Return type: Double(1)
#> # Arg types:
#> # x: Double(1)
#> # y: Double(1)
#> # subtract: Logical(1, anyNA = FALSE)

Note that your package would import {typed}, but ? won’t be exposed to its users: they will see it in the code, but will still be able to use ? just as before. In fact, the most common standard use, ?mean, still works even when {typed} is attached.

Acknowledgements

This is inspired in good part by Jim Hester and Gabor Csardi’s work and many great efforts on static typing, assertions, or annotations in R, in particular:

  • Gabor Csardi’s {argufy}
  • Richie Cotton’s {assertive}
  • Tony Fischetti’s {assertr}
  • Hadley Wickham’s {assertthat}
  • Michel Lang’s {checkmate}
  • Joe Thorley’s {checkr}
  • Joe Thorley’s {chk}
  • Aviral Goel’s {contractr}
  • Stefan Bache’s {ensurer}
  • Brian Lee Yung Rowe’s {lambda.r}
  • Kun Ren’s {rtype}
  • Jim Hester’s {types}
