In R 4.1 a native pipe operator was introduced that is "more streamlined" than previous implementations. I already noticed one difference between the native `|>` and the magrittr pipe `%>%`, namely `2 %>% sqrt` works but `2 |> sqrt` doesn't and has to be written as `2 |> sqrt()`. Are there more differences and pitfalls to be aware of when using the new pipe operator?
- Have you checked out the `?pipeOp` and the `?"%>%"` help pages? That's a good source of info. – MrFlick May 21 '21 at 08:14
5 Answers
| Topic | Magrittr 2.0.3 | Base 4.3.0 |
|---|---|---|
| Operator | %>% %<>% %$% %!>% %T>% | \|> (since 4.1.0) |
| Function call | 1:3 %>% sum() | 1:3 \|> sum() |
|  | 1:3 %>% sum | Needs brackets / parentheses |
|  | 1:3 %>% `+`(4) | Some functions are not supported |
| Insert on first empty place | mtcars %>% lm(formula = mpg ~ disp) | mtcars \|> lm(formula = mpg ~ disp) |
| Placeholder | . | _ (since 4.2.0) |
|  | mtcars %>% lm(mpg ~ disp, data = . ) | mtcars \|> lm(mpg ~ disp, data = _ ) |
|  | mtcars %>% lm(mpg ~ disp, . ) | Needs named argument |
|  | 1:3 %>% setNames(., .) | Can only appear once |
|  | 1:3 %>% {sum(sqrt(.))} | Nested calls are not allowed |
| Extraction call | mtcars %>% .$cyl; mtcars %>% {.$cyl[[3]]} or mtcars %$% cyl[[3]] | mtcars \|> _$cyl (since 4.3.0); mtcars \|> _$cyl[[3]] |
| Environment | %>% has additional function environment; use: "x" %!>% assign(1) | "x" \|> assign(1) |
| Speed | Slower because of the overhead of a function call | Faster because it is a syntax transformation |
Many differences and limitations disappear when using `|>` in combination with an (anonymous) function:
1 |> (\(.) .)()
-3:3 |> (\(.) sum(2*abs(.) - 3*.^2))()
See also: How to pipe purely in base R ('base pipe')? and What are the differences and use cases of the five Magrittr Pipes %>%, %<>%, %$%, %!>% and %T>%?
Needs brackets
library(magrittr)
1:3 |> sum
#Error: The pipe operator requires a function call as RHS
1:3 |> sum()
#[1] 6
1:3 |> approxfun(1:3, 4:6)()
#[1] 4 5 6
1:3 %>% sum
#[1] 6
1:3 %>% sum()
#[1] 6
1:3 %>% approxfun(1:3, 4:6) #But in this case empty parentheses are needed
#Error in if (is.na(method)) stop("invalid interpolation method") :
1:3 %>% approxfun(1:3, 4:6)()
#[1] 4 5 6
Some functions are not supported, but some of them can still be called by placing them in brackets, calling them via the `::` operator, using the placeholder, calling them inside a function, or defining an alias for the function.
1:3 |> `+`(4)
#Error: function '+' not supported in RHS call of a pipe
1:3 |> (`+`)(4)
#[1] 5 6 7
1:3 |> base::`+`(4)
#[1] 5 6 7
1:3 |> `+`(4, e2 = _)
#[1] 5 6 7
1 |> (`+`)(2) |> (`*`)(3) #(1 + 2) * 3 or `*`(`+`(1, 2), 3) and NOT 1 + 2 * 3
#[1] 9
1:3 |> (\(.) . + 4)()
#[1] 5 6 7
fun <- `+`
1:3 |> fun(4)
#[1] 5 6 7
1:3 %>% `+`(4)
#[1] 5 6 7
Placeholder needs named argument
2 |> setdiff(1:3, _)
#Error: pipe placeholder can only be used as a named argument
2 |> setdiff(1:3, y = _)
#[1] 1 3
2 |> (\(.) setdiff(1:3, .))()
#[1] 1 3
2 %>% setdiff(1:3, .)
#[1] 1 3
2 %>% setdiff(1:3, y = .)
#[1] 1 3
Also for variadic functions with `...` (dot-dot-dot) arguments, the placeholder `_` needs to be used as a named argument.
"b" |> paste("a", _, "c")
#Error: pipe placeholder can only be used as a named argument
"b" |> paste("a", . = _, "c")
#[1] "a b c"
"b" |> (\(.) paste("a", ., "c"))()
#[1] "a b c"
Placeholder can only appear once
1:3 |> setNames(nm = _)
#1 2 3
#1 2 3
1:3 |> setNames(object = _, nm = _)
#Error in setNames(object = "_", nm = "_") :
# pipe placeholder may only appear once
1:3 |> (\(.) setNames(., .))()
#1 2 3
#1 2 3
1:3 |> list() |> setNames(".") |> with(setNames(., .))
#1 2 3
#1 2 3
1:3 |> list(. = _) |> with(setNames(., .))
#1 2 3
#1 2 3
1:3 %>% setNames(object = ., nm = .)
#1 2 3
#1 2 3
1:3 %>% setNames(., .)
#1 2 3
#1 2 3
Nested calls are not allowed
1:3 |> sum(sqrt(x=_))
#Error in sum(1:3, sqrt(x = "_")) : invalid use of pipe placeholder
1:3 |> (\(.) sum(sqrt(.)))()
#[1] 4.146264
1:3 %>% {sum(sqrt(.))}
#[1] 4.146264
Extraction call
Experimental feature since 4.3.0. The placeholder `_` can now also be used in the rhs of a forward pipe `|>` expression as the first argument in an extraction call, such as `_$coef`. More generally, it can be used as the head of a chain of extractions, such as `_$coef[[2]]`.
mtcars |> _$cyl
mtcars |> _[["cyl"]]
mtcars |> _[,"cyl"]
# [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
mtcars |> _$cyl[[4]]
#[1] 6
mtcars %>% .$cyl
mtcars %>% .[["cyl"]]
mtcars %>% .[,"cyl"]
# [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
#mtcars %>% .$cyl[4] #gives mtcars[[4]]
mtcars %>% .$cyl %>% .[4]
#[1] 6
No additional Environment
assign("x", 1)
x
#[1] 1
"x" |> assign(2)
x
#[1] 2
"x" |> (\(x) assign(x, 3))()
x
#[1] 2
1:3 |> assign("x", value=_)
x
#[1] 1 2 3
"x" %>% assign(4)
x
#[1] 1 2 3
4 %>% assign("x", .)
x
#[1] 1 2 3
"x" %!>% assign(4) #Use instead the eager pipe
x
#[1] 4
5 %!>% assign("x", .)
x
#[1] 5
Other possibilities:
A different pipe operator and a different placeholder can be realized with the Bizarro pipe `->.;`, which is not really a pipe (see disadvantages): it simply assigns the lhs to `.`, overwriting any existing `.`.
1:3 ->.; sum(.)
#[1] 6
mtcars ->.; .$cyl
# [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
mtcars ->.; .$cyl[4]
#[1] 6
1:3 ->.; setNames(., .)
#1 2 3
#1 2 3
1:3 ->.; sum(sqrt(x=.))
#[1] 4.146264
"x" ->.; assign(., 5)
x
#[1] 5
6 ->.; assign("x", .)
x
#[1] 6
1:3 ->.; . + 4
#[1] 5 6 7
1 ->.; (`+`)(., 2) ->.; (`*`)(., 3)
#[1] 9
1 ->.; .+2 ->.; .*3
#[1] 9
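The Bizarro pipe also evaluates in a different order, because each step finishes before the next one starts, whereas `|>` builds a nested call whose arguments are evaluated lazily.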
x <- data.frame(a=0)
f1 <- \(x) {message("IN 1"); x$b <- 1; message("OUT 1"); x}
f2 <- \(x) {message("IN 2"); x$c <- 2; message("OUT 2"); x}
x ->.; f1(.) ->.; f2(.)
#IN 1
#OUT 1
#IN 2
#OUT 2
# a b c
#1 0 1 2
x |> f1() |> f2()
#IN 2
#IN 1
#OUT 1
#OUT 2
# a b c
#1 0 1 2
f2(f1(x))
#IN 2
#IN 1
#OUT 1
#OUT 2
# a b c
#1 0 1 2
Or define a custom pipe operator that sets `.` to the value of the lhs in a new environment and evaluates the rhs in it. With this approach, however, values in the calling environment cannot be created or changed.
`:=` <- \(lhs, rhs) eval(substitute(rhs), list(. = lhs))
mtcars := .$cyl[4]
#[1] 6
1:3 := setNames(., .)
#1 2 3
#1 2 3
1:3 := sum(sqrt(x=.))
#[1] 4.146264
"x" := assign(., 6)
x
#Error: object 'x' not found
1 := .+2 := .*3
#[1] 9
So another attempt is to assign the lhs to the placeholder `.` in the calling environment and evaluate the rhs there. But here `.` will be removed from the calling environment, even if it already existed there before the call.
`?` <- \(lhs, rhs) {
on.exit(if(exists(".", parent.frame())) rm(., envir = parent.frame()))
assign(".", lhs, envir=parent.frame())
eval.parent(substitute(rhs))
}
mtcars ? .$cyl[4]
#[1] 6
1:3 ? setNames(., .)
#1 2 3
#1 2 3
1:3 ? sum(sqrt(x=.))
#[1] 4.146264
"x" ? assign(., 6)
x
#[1] 6
1 ? .+2 ? .*3
#[1] 9
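The drawback just described (an existing `.` in the calling environment is lost) could be softened by saving and restoring the previous value. The following is only a sketch under that assumption; the operator name `%.%` is hypothetical and chosen purely for illustration. Because it is a `%any%` operator, its precedence differs from that of `?`, so chains such as `1 %.% .+2` will not group the same way.

```r
# Hypothetical operator `%.%`: like the `?` variant above, but it puts
# back a `.` that already existed in the calling environment.
`%.%` <- function(lhs, rhs) {
  env <- parent.frame()
  had_dot <- exists(".", envir = env, inherits = FALSE)
  if (had_dot) old <- get(".", envir = env, inherits = FALSE)
  # On exit, restore the caller's previous `.` or remove the temporary one.
  on.exit(
    if (had_dot) assign(".", old, envir = env) else rm(list = ".", envir = env)
  )
  assign(".", lhs, envir = env)   # forces the lhs eagerly
  eval(substitute(rhs), env)      # evaluate the rhs in the calling environment
}

. <- "keep me"
res <- 1:3 %.% sum(sqrt(.))
res  # same value as sum(sqrt(1:3))
.    # still "keep me"
```

The price is a few extra lookups per call, so this variant would likely be slower still than the `?` version.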
Another possibility is to replace every `.` with the lhs, so that during evaluation `.` no longer exists as a name.
`%|>%` <- \(lhs, rhs)
eval.parent(eval(call('substitute', substitute(rhs), list(. = lhs))))
mtcars %|>% .$cyl[4]
#[1] 6
1:3 %|>% setNames(., .)
#1 2 3
#1 2 3
1:3 %|>% sum(sqrt(x=.))
#[1] 4.146264
"x" %|>% assign(., 6)
x
#[1] 6
1 %|>% .+2 %|>% .*3
#[1] 7
The name of the operator determines its precedence; see "Same function but using for it the name %>% causes a different result compared when using the name :=".
For more advanced options see: Write own / custom pipe operator.
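The effect of the operator name on precedence can be inspected without defining any operator, because it is decided by the parser. The sketch below only looks at the parse trees of the three chained examples used above; the variable names are arbitrary.

```r
# Only parsing is inspected, so the operators need not be defined.
e_any   <- quote(1 %|>% . + 2 %|>% . * 3)
e_colon <- quote(1 := . + 2 := . * 3)
e_help  <- quote(1 ? . + 2 ? . * 3)

# The outermost call is the operator that binds most loosely.
as.character(e_any[[1]])    # "+" : %any% binds tighter than + and *
as.character(e_colon[[1]])  # ":="
as.character(e_help[[1]])   # "?"

# With %|>%, the left operand of `+` is just `1 %|>% .`
e_any[[2]]
```

This is consistent with the results above: the `%|>%` chain computes `1 + 2 * 3`, while the `:=` and `?` chains apply each step to the whole expression on their side.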
Speed
library(magrittr)
`:=` <- \(lhs, rhs) eval(substitute(rhs), list(. = lhs))
`?` <- \(lhs, rhs) {
on.exit(if(exists(".", parent.frame())) rm(., envir = parent.frame()))
assign(".", lhs, envir=parent.frame())
eval.parent(substitute(rhs))
}
`%|>%` <- \(lhs, rhs)
eval.parent(eval(call('substitute', substitute(rhs), list(. = lhs))))
x <- 42
bench::mark(min_time = 0.2, max_iterations = 1e8
, x
, identity(x)
, "|>" = x |> identity()
, "|> _" = x |> identity(x=_)
, "->.;" = {x ->.; identity(.)}
, "|> f()" = x |> (\(y) identity(y))()
, "%>%" = x %>% identity
, ":=" = x := identity(.)
, "list." = x |> list() |> setNames(".") |> with(identity(.))
, "%|>%" = x %|>% identity(.)
, "?" = x ? identity(.)
)
Result
expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc
<bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl> <int> <dbl>
1 x 31.08ns 48.2ns 19741120. 0B 7.46 2646587 1
2 identity(x) 491.04ns 553.09ns 1750116. 0B 27.0 323575 5
3 |> 497.91ns 548.08ns 1758553. 0B 27.3 322408 5
4 |> _ 506.87ns 568.92ns 1720374. 0B 26.9 320003 5
5 ->.; 725.03ns 786.04ns 1238488. 0B 21.2 233864 4
6 |> f() 972.07ns 1.03µs 929926. 0B 37.8 172288 7
7 %>% 2.76µs 3.05µs 315448. 0B 37.2 59361 7
8 := 3.02µs 3.35µs 288025. 0B 37.0 54561 7
9 list. 5.19µs 5.89µs 166721. 0B 36.8 31752 7
10 %|>% 6.01µs 6.86µs 143294. 0B 37.0 27076 7
11 ? 30.9µs 32.79µs 30074. 0B 31.3 5768 6

- Amazingly comprehensive. Note that `1:3 |> list() |> setNames(".") |> with(setNames(., .))` can be written as `1:3 |> list(. = _) |> with(setNames(., .))` or even `1:3 |> setNames(nm = _)` – G. Grothendieck Jun 30 '23 at 09:36
- Thanks for the variants also using the `_` placeholder! Maybe I should change to another function as there is no need to use the placeholder on two places here with `setNames`. – GKi Jul 03 '23 at 06:39
In R 4.1, there was no placeholder syntax for the native pipe. Thus, there was no equivalent of the `.` placeholder of magrittr, and the following was impossible with `|>`.
c("dogs", "cats", "rats") %>% grepl("at", .)
#[1] FALSE TRUE TRUE
As of R 4.2, the native pipe can use `_` as a placeholder, but only with named arguments.
c("dogs", "cats", "rats") |> grepl("at", x = _)
#[1] FALSE TRUE TRUE
The `.` of magrittr is still more flexible, as it can be repeated and can appear inside expressions.
c("dogs", "cats", "rats") %>%
paste(., ., toupper(.))
#[1] "dogs dogs DOGS" "cats cats CATS" "rats rats RATS"
c("dogs", "cats", "rats") |>
paste(x = _, y = _)
# Error in paste(x = "_", y = "_") : pipe placeholder may only appear once
It is also not clear how to use `|>` with a function that takes unnamed variadic arguments (i.e., `...`). In this `paste()` example, we can make up `x` and `y` arguments to trick the placeholder into the correct place, but that feels hacky.
c("dogs", "cats", "rats") |>
paste(x = "no", y = _)
#[1] "no dogs" "no cats" "no rats"
Here are additional ways to work around the placeholder limitations:
1. Write a separate function
find_at = function(x) grepl("at", x)
c("dogs", "cats", "rats") |> find_at()
#[1] FALSE TRUE TRUE
2. Use an anonymous function
a) Use the "old" syntax
c("dogs", "cats", "rats") |> {function(x) grepl("at", x)}()
b) Use the new anonymous function syntax
c("dogs", "cats", "rats") |> {\(x) grepl("at", x)}()
3. Specify the first parameter by name. This relies on the fact that the native pipe inserts the lhs as the first unnamed argument, so if you supply the first parameter by name, the lhs "overflows" into the second (and so on if you specify more than one parameter by name).
c("dogs", "cats", "rats") |> grepl(pattern="at")
#> [1] FALSE TRUE TRUE
- Examples 1 and 2 taken from https://www.jumpingrivers.com/blog/new-features-r410-pipe-anonymous-functions/
- Example 3 taken from https://mobile.twitter.com/rlangtip/status/1409904500157161477
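The "and so on" part of the third workaround can be illustrated with a function where the piped value should land in the third parameter. This is a small sketch using base R's `sub(pattern, replacement, x)`; the variable name `singular` is arbitrary.

```r
# Both leading parameters are named, so the piped vector falls through to `x`.
singular <- c("dogs", "cats", "rats") |> sub(pattern = "s$", replacement = "")
singular
#[1] "dog" "cat" "rat"
```

This needs neither the placeholder nor R 4.2, but it only helps when all the parameters that come before the target can be given by name.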

The base R pipe `|>` added in R 4.1.0 "just" does functional composition, i.e. we can see that its use really is just the same as the function call:
> 1:5 |> sum() # simple use of |>
[1] 15
> deparse(substitute( 1:5 |> sum() ))
[1] "sum(1:5)"
>
That has some consequences:
- it makes it a little faster
- it makes it a little simpler and more robust
- it makes it a little more restrictive: `sum()` here needs the parens for a proper call
- it limits uses of the 'implicit' data argument
This leads to possible use of `=>`, which is currently "available but not active" (you need to set the environment variable `_R_USE_PIPEBIND_` to enable it, and it may change for R 4.2.0).
(This was first offered as answer to a question duplicating this over here and I just copied it over as suggested.)
Edit: As the follow-up question on 'what is `=>`' comes up, here is a quick follow-up. Note that this operator is subject to change.
> Sys.setenv("_R_USE_PIPEBIND_"=TRUE)
> mtcars |> subset(cyl == 4) |> d => lm(mpg ~ disp, data = d)
Call:
lm(formula = mpg ~ disp, data = subset(mtcars, cyl == 4))
Coefficients:
(Intercept) disp
40.872 -0.135
> deparse(substitute(mtcars |> subset(cyl==4) |> d => lm(mpg ~ disp, data = d)))
[1] "lm(mpg ~ disp, data = subset(mtcars, cyl == 4))"
>
The `deparse(substitute(...))` is particularly nice here.

- Implicit data argument is `.` from magrittr? What do you mean by possible use of `=>`? would pipebind be like `%<>%`? – qwr Jun 03 '21 at 12:25
- Yes. There are a few examples flying around of using `=>` to assign a data element _explicitly_ to a named variable, say `x`, and use that in, say, `lm(...., data=x)`. – Dirk Eddelbuettel Jun 03 '21 at 12:29
- I'm still not sure what `=>` does but I suppose that is for a separate question. – qwr Jun 03 '21 at 12:40
- Exactly. And it is also temporary. But I'll toss in a quick edit. – Dirk Eddelbuettel Jun 03 '21 at 12:58
The native pipe is implemented as a syntax transformation and so `2 |> sqrt()` has no discernible overhead compared to `sqrt(2)`, whereas `2 %>% sqrt()` comes with a small penalty.
library(magrittr) # needed for %>%
microbenchmark::microbenchmark(
sqrt(1),
2 |> sqrt(),
3 %>% sqrt()
)
# Unit: nanoseconds
# expr min lq mean median uq max neval
# sqrt(1) 117 126.5 141.66 132.0 139 246 100
# sqrt(2) 118 129.0 156.16 134.0 145 1792 100
# 3 %>% sqrt() 2695 2762.5 2945.26 2811.5 2855 13736 100
You see how the expression `2 |> sqrt()` passed to `microbenchmark` is parsed as `sqrt(2)`. This can also be seen in
quote(2 |> sqrt())
# sqrt(2)
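The same check can be extended to a longer chain that uses the `_` placeholder. The sketch below assumes R >= 4.2.0 (for `_`) and compares the parsed pipeline with the hand-written nested call; `piped` and `nested` are arbitrary names.

```r
# Requires R >= 4.2.0 for the `_` placeholder.
piped  <- quote(mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = _))
nested <- quote(lm(mpg ~ disp, data = subset(mtcars, cyl == 4)))

# The parser has already rewritten the pipeline, so both are the same call.
identical(piped, nested)
#[1] TRUE
```

Since nothing of the pipe survives parsing, there is nothing left to cost time when the code runs.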

One difference is their placeholder: `_` in base R, `.` in `magrittr`.
Since R 4.2.0, the base R pipe has a placeholder for piped-in values, `_`, similar to `%>%`'s `.`, but its use is restricted to named arguments, and it can only be used once per call.
It is now possible to use a named argument with the placeholder _ in the rhs call to specify where the lhs is to be inserted. The placeholder can only appear once on the rhs.
To reiterate Ronak Shah's example, you can now use `_` as a named argument on the right-hand side to refer to the left-hand side of the pipe:
c("dogs", "cats", "rats") |>
grepl("at", x = _)
#[1] FALSE TRUE TRUE
but it has to be named:
c("dogs", "cats", "rats") |>
grepl("at", _)
#Error: pipe placeholder can only be used as a named argument
and cannot appear more than once (to overcome this issue, one can still use the solutions provided by Ronak Shah):
c("dogs", "cats", "rats") |>
expand.grid(x = _, y = _)
# Error in expand.grid(x = "_", y = "_") : pipe placeholder may only appear once
While this is possible with `magrittr`:
library(magrittr)
c("dogs", "cats", "rats") %>%
expand.grid(x = ., y = .)
# x y
#1 dogs dogs
#2 cats dogs
#3 rats dogs
#4 dogs cats
#5 cats cats
#6 rats cats
#7 dogs rats
#8 cats rats
#9 rats rats
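For comparison, the anonymous-function workaround mentioned above could be applied to the same `expand.grid()` call with the native pipe. This is only a sketch; the names `animals`, `grid` and the parameter `v` are arbitrary.

```r
animals <- c("dogs", "cats", "rats")

# The lambda receives the lhs once and may use it as often as needed.
grid <- animals |> (\(v) expand.grid(x = v, y = v))()

nrow(grid)   # 9 combinations, as in the magrittr output above
names(grid)  # "x" "y"
```

Because the lhs is bound to the parameter `v`, it is evaluated only once even though it is used twice.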

- Does it make any sense to restrict the use for just once? Do they have a plan to remove this constraint? – GitHunter0 Apr 26 '22 at 17:40
- If I had to guess (I'm not R-core), it's because these operators (`|>`, etc) rewrite the syntax so that `longcalc() |> quux(x = _)` becomes `quux(x = longcalc())`, and they don't want `longcalc() |> quux(x=_, y=_)` to translate into double the calc with `quux(x=longcalc(), y=longcalc())` (where the second is a redundant and double-the-time call). Just a guess, though. @GitHunter0 – r2evans Sep 09 '22 at 11:38