10

Is there a way to pipe in base R, without having to define your own function (i.e. something 'out of the box'), and without having to load any external packages?

That is, some (base R) alternative to the magrittr pipe %>%. Possibly in a forthcoming version of R (?)

Is this functionality available in R 4.0.3? If not, which R version is it found in, and how is it used?

stevec
  • 5
    The base R `|>` operator is supposedly under development. Check out the [useR! keynote](https://youtu.be/X_eDHNVceCU?t=4082) – Ian Campbell Dec 16 '20 at 18:58
  • 2
    The core R devs are working on `|>` as a base-R pipe operator. It'll be similar to magrittr's `%>%`. I don't know its expected release date. – r2evans Dec 16 '20 at 18:58
  • 1
    Right now the base R pipe is in the dev version of R – Andrew Dec 16 '20 at 18:58
  • @r2evans, |> is not that similar. It does not support placeholders (you must create an anonymous function instead) and functions on the right hand side with one arg must be written as f(). f without parentheses after it is not supported – G. Grothendieck Dec 16 '20 at 22:44
  • 1
    You said without defining own function but as a comment, this will work well if you use explicit `.` all the time `'%>%' <- function (lhs, rhs) eval(substitute(rhs), envir = list(. = lhs), enclos = parent.frame())`. e.g. `iris %>% head(.)` – moodymudskipper Dec 23 '20 at 00:13
  • 1
    [Related](https://stackoverflow.com/q/67633022/13460602) – Maël Apr 28 '22 at 13:42

3 Answers

20

In R, |> is the native pipe operator (since R 4.1.0).

The left-hand side expression lhs is inserted as the first free argument in the call to the right-hand side expression rhs.

mtcars |> head()                      # same as head(mtcars)
#                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
#Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

mtcars |> head(2)                     # same as head(mtcars, 2)
#              mpg cyl disp  hp drat    wt  qsec vs am gear carb
#Mazda RX4      21   6  160 110  3.9 2.620 16.46  0  1    4    4
#Mazda RX4 Wag  21   6  160 110  3.9 2.875 17.02  0  1    4    4
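
Pipes can also be chained, with each step receiving the previous result as its first argument. A minimal sketch using only base functions and the built-in mtcars data (the names n_four and n_nested are just for illustration; requires R >= 4.1.0):

```r
# Each step receives the previous result as its first argument.
n_four <- mtcars |>
  subset(cyl == 4) |>   # subset(mtcars, cyl == 4)
  nrow()                # nrow(<result of subset>)

# The nested call that the pipeline is equivalent to.
n_nested <- nrow(subset(mtcars, cyl == 4))

identical(n_four, n_nested)
#[1] TRUE
```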

It is also possible to use a named argument with the placeholder _ in the rhs call to specify where the lhs is to be inserted. The placeholder can only appear once on the rhs. (Since 4.2.0)

mtcars |> lm(mpg ~ disp, data = _)
#mtcars |> lm(mpg ~ disp, _)  #Error: pipe placeholder can only be used as a named argument
#Call:
#lm(formula = mpg ~ disp, data = mtcars)
#
#Coefficients:
#(Intercept)         disp  
#   29.59985     -0.04122  

Alternatively, explicitly name the argument(s) that come before the target one, so that the lhs falls into the first free (unnamed) position:

mtcars |> lm(formula = mpg ~ disp)
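
This works because arguments matched by name are set aside first, so the piped lhs fills the first remaining formal, which for lm is data. A small check that the result matches the direct call (fit_pipe and fit_direct are illustrative names):

```r
# Parsed as lm(mtcars, formula = mpg ~ disp); mtcars is matched to `data`.
fit_pipe   <- mtcars |> lm(formula = mpg ~ disp)
fit_direct <- lm(mpg ~ disp, data = mtcars)

all.equal(coef(fit_pipe), coef(fit_direct))
#[1] TRUE
```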

If the lhs is needed more than once, has to be passed as an unnamed argument in an arbitrary position, or the function is disabled on the rhs, use an (anonymous) function.

mtcars |> (\(.) .[.$cyl == 6,])()
#mtcars ->.; .[.$cyl == 6,]           # Alternative using bizarro pipe
#local({mtcars ->.; .[.$cyl == 6,]})  # Without overwriting and keeping .
#                mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#Mazda RX4      21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
#Mazda RX4 Wag  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
#Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
#Valiant        18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
#Merc 280       19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
#Merc 280C      17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
#Ferrari Dino   19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6

mtcars |> (\(.) lm(mpg ~ disp, .))()
#Call:
#lm(formula = mpg ~ disp, data = .)
#
#Coefficients:
#(Intercept)         disp  
#   29.59985     -0.04122

1:3 |> setNames(object = _, nm = _)
#Error in setNames(object = "_", nm = "_") : 
#  pipe placeholder may only appear once
1:3 |> (\(.) setNames(., .))()
#1 2 3 
#1 2 3

1:3 |> list() |> setNames(".") |> with(setNames(., .))
#1 2 3 
#1 2 3 

#The same, but via a helper function
._ <- \(data, expr, ...) {
  eval(substitute(expr), list(. = data), enclos = parent.frame())
}
1:3 |> ._(setNames(., .))
#1 2 3 
#1 2 3 
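
The parameter of the anonymous function does not have to be called `.`; any name works, which may read more clearly when the lhs is used several times. A small sketch (the name v and the input vector are arbitrary choices for illustration):

```r
# The lhs is bound to `v` and used twice inside the function body.
centered <- c(2, 4, 9) |> (\(v) v - mean(v))()
centered
#[1] -3 -1  4
```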

Some functions are disabled on the rhs.

1:3 |> `+`(4)
#Error: function '+' not supported in RHS call of a pipe

Some of them can still be called by wrapping them in parentheses, calling them via ::, calling them inside a function, or assigning the function to another name.

1:3 |> (`+`)(4)
#[1] 5 6 7

1:3 |> base::`+`(4)
#[1] 5 6 7

1:3 |> (\(.) . + 4)()
#[1] 5 6 7

fun <- `+`
1:3 |> fun(4)
#[1] 5 6 7

An expression written as x |> f(y) is parsed as f(x, y). Although the code in a pipeline is written sequentially, regular R evaluation semantics apply: arguments are evaluated lazily, so a piped expression is evaluated only when it is first used in the rhs expression.

-1 |> sqrt() |> (\(x) 0)()
#[1] 0

. <- -1
. <- sqrt(.)
#Warning message:
#In sqrt(.) : NaNs produced
(\(x) 0)(.)
#[1] 0


x <- data.frame(a=0)
f1 <- \(x) {message("IN 1"); x$b <- 1; message("OUT 1"); x}
f2 <- \(x) {message("IN 2"); x$c <- 2; message("OUT 2"); x}

x |> f1() |> f2()
#IN 2
#IN 1
#OUT 1
#OUT 2
#  a b c
#1 0 1 2

f2(f1(x))
#IN 2
#IN 1
#OUT 1
#OUT 2
#  a b c
#1 0 1 2

. <- x
. <- f1(.)
#IN 1
#OUT 1
f2(.)
#IN 2
#OUT 2
#  a b c
#1 0 1 2
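
The parse-time rewriting and the lazy evaluation described above can also be checked directly. A short sketch, assuming R >= 4.1.0 (the names res, res2 and f are only illustrative):

```r
# The pipe is resolved by the parser, so both expressions are the same call.
identical(quote(x |> f(y)), quote(f(x, y)))
#[1] TRUE

# Lazy evaluation: the rhs never uses its argument, so stop() never runs.
res <- stop("never evaluated") |> (\(x) "ok")()
res
#[1] "ok"

# force() inside the function makes the lhs get evaluated first.
f <- \(x) { force(x); "ok" }
res2 <- tryCatch(stop("evaluated") |> f(), error = conditionMessage)
res2
#[1] "evaluated"
```

If a step is needed for its side effects (a warning, a message, an error), the receiving function has to actually use its argument.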
zx8754
GKi
  • 2
    Looking forward to trying! Any noticeable differences from the magrittr pipe? Does it use `.` to represent the placeholder for the passed through argument? – stevec May 18 '21 at 10:23
  • 1
    The lambda function `\(x)` should also be available: `mtcars |> subset(cyl == 4) |> (\(d) lm(mpg ~ disp, data = d))()` – Waldi May 18 '21 at 13:21
  • @stevec You can use `|> . =>` to have a placeholder (see example in the answer), but this needs to be explicitly activated, which signals that this might change in the future. – GKi Jun 08 '21 at 07:23
  • At this point, the use of `=>` should not be encouraged anymore, since it will probably be dropped according to [Luke Tierney](https://www.mail-archive.com/r-devel@r-project.org/msg44046.html) – shs Jan 18 '22 at 10:50
  • All nice and all, but in R 4.1.2 after running your second example with `data = _`, I get `Error: unexpected input in "mtcars |> lm(mpg ~ disp, data = _"`. So the underscore does not seem to be supported in this recent version of R. – MS Berends Apr 28 '22 at 12:10
  • 2
    Ah, so the documentation on R 4.2.0 (released just 3 weeks ago) states _"In a forward pipe |> expression it is now possible to use a named argument with the placeholder _ in the rhs call to specify where the lhs is to be inserted. The placeholder can only appear once on the rhs."_. You should state in your answer it requires the VERY last and recently released version of R. – MS Berends Apr 28 '22 at 12:13
14

You can use the bizarro pipe (also this), which is just a clever use of existing syntax and requires no functions or packages, e.g.

mtcars ->.;
  transform(., mpg = 2 * mpg) ->.;   # change units
  lm(mpg ~., .) ->.;
  coef(.)

where ->.; looks something like a pipe.
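
Because each step assigns to `.` in the current environment, the chain leaves `.` behind and overwrites any existing `.`. A sketch of the same chain wrapped in local({ ... }) so that `.` stays private to the block (the name coefs is just for illustration):

```r
coefs <- local({
  mtcars ->.;
  transform(., mpg = 2 * mpg) ->.;   # change units
  lm(mpg ~ ., .) ->.;
  coef(.)
})
names(coefs)[1]
#[1] "(Intercept)"
```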

G. Grothendieck
8

As of the writing of this answer, the release version of R (4.0.3) does not include a pipe operator.

However, as was noted in the useR! 2020 keynote, the base |> operator is under development.

From the pipeOp man page from the R-devel daily source for 2020-12-15:

A pipe expression passes, or pipes, the result of the lhs expression to the rhs expression.

If the rhs expression is a call, then the lhs is inserted as the first argument in the call. So x |> f(y) is interpreted as f(x, y). To avoid ambiguities, functions in rhs calls may not be syntactically special, such as + or if.

If the rhs expression is a function expression, then the function is called with the lhs value as its argument. This is useful when the lhs needs to be passed as an argument other than the first in the rhs call.

When this operator will make it into the release version, or if it may change prior to release, is unknown.

Ian Campbell