processx

Execute and Control System Processes

Tools to run system processes in the background, read their standard output and error and kill them.

processx can poll the standard output and error of a single process, or multiple processes, using the operating system's polling and waiting facilities, with a timeout.


Features

  • Start system processes in the background and find their process id.
  • Read the standard output and error, using non-blocking connections
  • Poll the standard output and error connections of a single process or multiple processes.
  • Write to the standard input of background processes.
  • Check if a background process is running.
  • Wait on a background process, or multiple processes, with a timeout.
  • Get the exit status of a background process, if it has already finished.
  • Kill background processes.
  • Kill a background process when its associated object is garbage collected.
  • Kill background processes and all their child processes.
  • Works on Linux, macOS and Windows.
  • Lightweight: it depends only on the R6 and ps packages, which are themselves lightweight.

Installation

Install the stable version from CRAN:

install.packages("processx")

Usage

library(processx)

Note: the following external commands are usually present on macOS and Linux systems, but not necessarily on Windows. We will also use the px command line tool (px.exe on Windows), a very simple program that can produce output on stdout and stderr with the specified timings.

px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)
px
#> [1] "/Users/gaborcsardi/Library/R/arm64/4.2/library/processx/bin/px"

Running an external process

The run() function runs an external command. It requires a single command, and a character vector of arguments. You don't need to quote the command or the arguments, as they are passed directly to the operating system, without an intermediate shell.

run("echo", "Hello R!")
#> $status
#> [1] 0
#> 
#> $stdout
#> [1] "Hello R!\n"
#> 
#> $stderr
#> [1] ""
#> 
#> $timeout
#> [1] FALSE

Short summary of the px binary we are using extensively below:

result <- run(px, "--help", echo = TRUE)
#> Usage: px [command arg] [command arg] ...
#> 
#> Commands:
#>   sleep  <seconds>           -- sleep for a number os seconds
#>   out    <string>            -- print string to stdout
#>   err    <string>            -- print string to stderr
#>   outln  <string>            -- print string to stdout, add newline
#>   errln  <string>            -- print string to stderr, add newline
#>   errflush                   -- flush stderr stream
#>   cat    <filename>          -- print file to stdout
#>   return <exitcode>          -- return with exitcode
#>   writefile <path> <string>  -- write to file
#>   write <fd> <string>        -- write to file descriptor
#>   echo <fd1> <fd2> <nbytes>  -- echo from fd to another fd
#>   getenv <var>               -- environment variable to stdout

Note: From version 3.0.1, processx does not let you specify a full shell command line, as this involves starting a grandchild process from the child process, and it is difficult to clean up the grandchild process when the child process is killed. The user can still start a shell (sh or cmd.exe) directly of course, and then proper cleanup is the user's responsibility.
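
If a shell pipeline is genuinely needed, one option is to start the shell yourself and hand it the command line as a single argument. The following is a minimal sketch, assuming a Unix-like system where sh and tr are on the PATH (on Windows, cmd.exe with "/c" would play the same role). As noted above, cleaning up anything the shell starts is then up to you.

```r
library(processx)

# Assumption: a POSIX shell (`sh`) and `tr` are available on the PATH.
# The whole pipeline is passed as ONE argument to `sh -c`, so the shell,
# not processx, interprets the pipe character.
res <- run("sh", c("-c", "echo 'Hello R!' | tr a-z A-Z"))
res$stdout
```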

Errors

By default run() throws an error if the process exits with a non-zero status code. To avoid this, specify error_on_status = FALSE:

run(px, c("out", "oh no!", "return", "2"), error_on_status = FALSE)
#> $status
#> [1] 2
#> 
#> $stdout
#> [1] "oh no!"
#> 
#> $stderr
#> [1] ""
#> 
#> $timeout
#> [1] FALSE
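
When you keep the default error_on_status = TRUE, the error can be handled with base R's tryCatch(). The sketch below is illustrative only: the handler and the variable name failed_status are made up for this example, and it assumes that the condition object carries the exit status in a status field, as described in the run() manual page.

```r
library(processx)

px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)

# `failed_status` is a hypothetical name used only in this sketch.
failed_status <- tryCatch(
  {
    run(px, c("errln", "something went wrong", "return", "3"))
    0L  # only reached if the command exits with status zero
  },
  error = function(e) {
    message("Command failed: ", conditionMessage(e))
    # Assumption: the condition thrown by run() has a `status` field.
    e$status
  }
)
failed_status
```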

Showing output

To show the output of the process on the screen, use the echo argument. Note that the order of stdout and stderr lines may be incorrect, because they are coming from two different connections.

result <- run(px,
  c("outln", "out", "errln", "err", "outln", "out again"),
  echo = TRUE)
#> out
#> out again
#> err

If you have a terminal that supports ANSI colors, then the standard error output is shown in red.

The standard output and error are still included in the result of the run() call:

result
#> $status
#> [1] 0
#> 
#> $stdout
#> [1] "out\nout again\n"
#> 
#> $stderr
#> [1] "err\n"
#> 
#> $timeout
#> [1] FALSE

Note that run() is different from system(), and it always shows the output of the process on R's proper standard output, instead of writing to the terminal directly. This means for example that you can capture the output with capture.output() or use sink(), etc.:

out1 <- capture.output(r1 <- system("ls"))
out2 <- capture.output(r2 <- run("ls", echo = TRUE))
out1
#> character(0)
out2
#>  [1] "CODE_OF_CONDUCT.md" "DESCRIPTION"        "LICENSE"           
#>  [4] "LICENSE.md"         "Makefile"           "NAMESPACE"         
#>  [7] "NEWS.md"            "R"                  "README.Rmd"        
#> [10] "README.md"          "_pkgdown.yml"       "codecov.yml"       
#> [13] "inst"               "man"                "processx.Rproj"    
#> [16] "src"                "tests"

Spinner

The spinner option of run() shows a calming spinner in the terminal while the background program is running. The spinner is always shown in the first character of the last line, so you can make it work nicely with the regular output of the background process if you like. E.g. try this in your R terminal:

result <- run(px,
  c("out", "  foo",
    "sleep", "1",
    "out", "\r  bar",
    "sleep", "1",
    "out", "\rX foobar\n"),
  echo = TRUE, spinner = TRUE)

Callbacks for I/O

run() can call an R function for each line of the standard output or error of the process: supply the stdout_line_callback or the stderr_line_callback argument. Each callback takes two arguments. The first is a character scalar, the output line. The second is the process object that represents the background process. (See more below about process objects.) You can manipulate this object in the callback if you want. For example, you can kill the process in response to an error or some text on the standard output:

cb <- function(line, proc) {
  cat("Got:", line, "\n")
  if (line == "done") proc$kill()
}
result <- run(px,
  c("outln", "this", "outln", "that", "outln", "done",
    "outln", "still here", "sleep", "10", "outln", "dead by now"),
  stdout_line_callback = cb,
  error_on_status = FALSE)
#> Got: this 
#> Got: that 
#> Got: done 
#> Got: still here
result
#> $status
#> [1] -9
#> 
#> $stdout
#> [1] "this\nthat\ndone\nstill here\n"
#> 
#> $stderr
#> [1] ""
#> 
#> $timeout
#> [1] FALSE

Keep in mind that the background process is not paused while the R callback runs; both run concurrently. In the previous example, whether still here is printed depends on how the OS schedules the R process and the background process. Typically it is printed, because the R callback takes a while to run.

In addition to the line-oriented callbacks, the stdout_callback and stderr_callback arguments can specify callback functions that are called with output chunks instead of single lines. A chunk may contain multiple lines (separated by \n or \r\n), or even incomplete lines.
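
A chunk callback can be sketched as follows. The accumulator environment acc and the callback name chunk_cb are hypothetical names for this example. How the output is split into chunks depends on timing, so the sketch only relies on the concatenation of all chunks.

```r
library(processx)

px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)

# `acc` and `chunk_cb` are illustrative names, not part of processx.
acc <- new.env()
acc$chunks <- character()

chunk_cb <- function(chunk, proc) {
  # A chunk may hold several lines, or only part of a line.
  acc$chunks <- c(acc$chunks, chunk)
}

result <- run(px,
  c("out", "partial ", "outln", "line", "outln", "second"),
  stdout_callback = chunk_cb)

# Gluing the chunks back together recovers the full output stream.
all_output <- paste(acc$chunks, collapse = "")
all_output
```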

Managing external processes

If you need better control over possibly multiple background processes, then you can use the R6 process class directly.

Starting processes

To start a new background process, create a new instance of the process class.

p <- process$new("sleep", "20")

Killing a process

A process can be killed via the kill() method.

p$is_alive()
#> [1] TRUE
p$kill()
#> [1] TRUE
p$is_alive()
#> [1] FALSE

Note that processes are finalized (and killed) automatically if the corresponding process object goes out of scope, as soon as the object is garbage collected by R:

p <- process$new("sleep", "20")
rm(p)
invisible(gc())

Here, the direct call to the garbage collector kills the sleep process as well. See the cleanup option if you want to avoid this behavior.
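
As a rough sketch of that option: with cleanup = FALSE the process should survive garbage collection of its object, and stopping it becomes your responsibility. The example assumes a sleep command on the PATH, as in the example above, and uses the ps package (a processx dependency) to find the process again by its pid.

```r
library(processx)

# cleanup = FALSE: do not kill the process when the R object is finalized.
p <- process$new("sleep", "20", cleanup = FALSE)
pid <- p$get_pid()

rm(p)
invisible(gc())

# Look the process up again by pid and check that it is still there.
h <- ps::ps_handle(pid)
running <- ps::ps_is_running(h)
running

# Nothing kills it automatically now, so do it explicitly.
invisible(ps::ps_kill(h))
```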

Standard output and error

By default the standard output and error of the processes are ignored. You can set the stdout and stderr constructor arguments to a file name, and then they are redirected there, or to "|", and then processx creates connections to them. (Note that starting from processx 3.0.0 these connections are not regular R connections, because the public R connection API was retroactively removed from R.)

The read_output_lines() and read_error_lines() methods can be used to read complete lines from the standard output or error connections. They work similarly to the readLines() base R function.

Note that the connections have a buffer, which can fill up if R does not read the output. When that happens, the process blocks until R reads from the connection and frees space in the buffer.

Always make sure that you read the standard output and/or error pipes, otherwise the background process will stop running!

If you don't need the standard output or error any more, you can also close it, like this:

close(p$get_output_connection())
close(p$get_error_connection())

Note that the connections used for reading the output and error streams are non-blocking, so the read functions will return immediately, even if there is no text to read from them. If you want to make sure that there is data available to read, you need to poll, see below.

p <- process$new(px,
  c("sleep", "1", "outln", "foo", "errln", "bar", "outln", "foobar"),
  stdout = "|", stderr = "|")
p$read_output_lines()
#> character(0)
p$read_error_lines()
#> character(0)

End of output

The standard R way to query the end of the stream for a non-blocking connection is to use the isIncomplete() function. After a read attempt, this function returns FALSE if the connection surely has no more data. (If the read attempt returns no data, but isIncomplete() returns TRUE, then the connection might deliver more data in the future.)

The is_incomplete_output() and is_incomplete_error() functions work similarly for process objects.
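
One way to combine these pieces is a read loop that runs until the process has exited and the output pipe has reached its end. This is a minimal sketch, not the only possible pattern. It uses the poll_io() method to avoid busy waiting, and the 1000 ms timeout is an arbitrary choice.

```r
library(processx)

px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)

p <- process$new(px,
  c("outln", "one", "sleep", "1", "outln", "two"),
  stdout = "|")

lines <- character()
# Continue while the process runs or the pipe may still deliver data.
while (p$is_alive() || p$is_incomplete_output()) {
  p$poll_io(1000)                           # wait up to 1 second for data
  lines <- c(lines, p$read_output_lines())  # non-blocking read
}
lines
```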

Polling the standard output and error

The poll_io() method waits for data on the standard output and/or error of a process. It will return if any of the following events happen:

  • Data is available on the standard output of the process (assuming there is a connection to the standard output).
  • Data is available on the standard error of the process (assuming there is a connection to the standard error).
  • The process has finished and the standard output and/or error connections were closed on the other end.
  • The specified timeout period expired.

For example the following code waits about a second for output.

p <- process$new(px, c("sleep", "1", "outln", "kuku"), stdout = "|")

## No output yet
p$read_output_lines()
#> character(0)
## Wait at most 5 sec
p$poll_io(5000)
#>   output    error  process 
#>  "ready" "nopipe" "nopipe"
## There is output now
p$read_output_lines()
#> [1] "kuku"

Polling multiple processes

If you need to manage multiple background processes, and need to wait for output from all of them, processx defines a poll() function that does just that. It is similar to the poll_io() method, but it takes multiple process objects, and returns as soon as one of them has data on standard output or error, or a timeout expires. Here is an example:

p1 <- process$new(px, c("sleep", "1", "outln", "output"), stdout = "|")
p2 <- process$new(px, c("sleep", "2", "errln", "error"), stderr = "|")

## After 100ms no output yet
poll(list(p1 = p1, p2 = p2), 100)
#> $p1
#>    output     error   process 
#> "timeout"  "nopipe"  "nopipe" 
#> 
#> $p2
#>    output     error   process 
#>  "nopipe" "timeout"  "nopipe"
## But now we surely have something
poll(list(p1 = p1, p2 = p2), 1000)
#> $p1
#>   output    error  process 
#>  "ready" "nopipe" "nopipe" 
#> 
#> $p2
#>   output    error  process 
#> "nopipe" "silent" "nopipe"
p1$read_output_lines()
#> [1] "output"
## Done with p1
close(p1$get_output_connection())
#> NULL
## The second process should have data on stderr soonish
poll(list(p1 = p1, p2 = p2), 5000)
#> $p1
#>   output    error  process 
#> "closed" "nopipe" "nopipe" 
#> 
#> $p2
#>   output    error  process 
#> "nopipe"  "ready" "nopipe"
p2$read_error_lines()
#> [1] "error"

Waiting on a process

As seen before, is_alive() checks if a process is running. The wait() method waits until the process has finished (or a specified timeout expires). E.g. in the following code wait() needs to wait about 2 seconds for the sleep px command to finish.

p <- process$new(px, c("sleep", "2"))
p$is_alive()
#> [1] TRUE
Sys.time()
#> [1] "2022-06-10 13:57:49 CEST"
p$wait()
Sys.time()
#> [1] "2022-06-10 13:57:51 CEST"

It is safe to call wait() multiple times:

p$wait() # already finished!
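
The timeout variant can be sketched like this. The timeout is given in milliseconds. If it expires first, wait() returns and the process keeps running, so the sketch checks is_alive() afterwards and then kills the process.

```r
library(processx)

px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)

p <- process$new(px, c("sleep", "5"))

# Wait at most half a second; the process needs about five.
p$wait(timeout = 500)
still_running <- p$is_alive()
still_running

# The timeout only ends the waiting, not the process.
invisible(p$kill())
```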

Exit statuses

After a process has finished, its exit status can be queried via the get_exit_status() method. If the process is still running, then this method returns NULL.

p <- process$new(px, c("sleep", "2"))
p$get_exit_status()
#> NULL
p$wait()
p$get_exit_status()
#> [1] 0

Mixing processx and the parallel base R package

In general, mixing processx (via callr or not) and parallel works fine. If you use parallel's 'fork' clusters, e.g. via parallel::mcparallel(), then you might see two issues. One is that processx will not be able to determine the exit status of some processx processes. This is because the status is read out by parallel, and processx will set it to NA. The other one is that parallel might complain that it could not clean up some subprocesses. This is not an error, and it is harmless, but it does hold up R for about 10 seconds, before parallel gives up. To work around this, you can set the PROCESSX_NOTIFY_OLD_SIGCHLD environment variable to a non-empty value, before you load processx. This behavior might be the default in the future.
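
In a script, that workaround could look like the sketch below. The value "true" is an arbitrary choice, since any non-empty value should do. The variable needs to be set before processx is loaded, so it would have no effect in a session where the package is already loaded.

```r
# Set the variable first ...
Sys.setenv(PROCESSX_NOTIFY_OLD_SIGCHLD = "true")

# ... and only then load processx (directly, or indirectly via callr).
library(processx)
```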

Errors

Errors are typically signalled via non-zero exit statuses. The process constructor fails if the external program cannot be started, but it does not deal with errors that happen after the program has successfully started running.

p <- process$new("nonexistant-command-for-sure")
#> Error in c("process_initialize(self, private, command, args, stdin, stdout, ", : ! Native call to `processx_exec` failed
#> Caused by error in `chain_call(c_processx_exec, command, c(command, args), pty, pty_options, …` at initialize.R:138:3:
#> ! cannot start processx process 'nonexistant-command-for-sure' (system error 2, No such file or directory) @unix/processx.c:613 (processx_exec)
p2 <- process$new(px, c("sleep", "1", "command-does-not-exist"))
p2$wait()
p2$get_exit_status()
#> [1] 5

Related tools

  • The ps package can query, list, manipulate all system processes (not just subprocesses), and processx uses it internally for some of its functionality. You can also convert a processx::process object to a ps::ps_handle with the as_ps_handle() method.

  • The callr package uses processx to start another R process, and run R code in it, in the foreground or background.
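
As a small illustration of the first point, a process object can be handed over to ps for inspection. The sketch assumes a sleep command on the PATH, as in the earlier examples, and only uses ps::ps_pid() and ps::ps_status() from the ps package.

```r
library(processx)

p <- process$new("sleep", "20")

# Convert the processx object to a ps handle.
h <- p$as_ps_handle()

# Both views refer to the same system process.
same_pid <- ps::ps_pid(h) == p$get_pid()
same_pid

# ps can report details that processx itself does not expose.
ps::ps_status(h)

invisible(p$kill())
```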

Code of Conduct

Please note that the processx project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

License

MIT © Mango Solutions, RStudio, Gábor Csárdi
