
When using the pipe, the lhs is automatically placed as the first argument of the rhs. As a result, the following two lines are equivalent:

library(dplyr)
1:10 %>% mean(na.rm = TRUE) # same as...
1:10 %>% mean(., na.rm = TRUE)

I just ran into a situation where I needed the pipe to stop automatically using . as the first argument (as asked here). In fact, I never write 1:10 %>% mean(na.rm = TRUE); I always supply the first argument explicitly with . when using the pipe. I am therefore wondering whether it is possible to prevent the pipe from automatically inserting . as the first argument.

This means %>% should behave by default as if the rhs was in curly brackets:

1:10 %>% {mean(na.rm = TRUE)} 
# Returns an error since x is missing. This is expected! Again: I
# want %>% not to automatically provide the first argument with .

There is a similar question, but in my case I want %>% to avoid this behavior by default.

LulY
  • I don't think so. You could write your own pipe operator ... – Ben Bolker May 02 '23 at 13:16
  • Hmm this is *basically* a duplicate of the linked question. But in case the other question hadn't made it clear yet: the only way to do this with the ‘magrittr’ pipe is to use `{…}`. If you do not like this you will have to write your own pipe operator (which, if you always want to pass `.` explicitly, isn’t hard). – Konrad Rudolph May 02 '23 at 13:22
  • @KonradRudolph "(which, if you always want to pass . explicitly, isn’t hard)": Yes I want to and I do so (as stated in the question). And yes, I wanted to see how "to write your own pipe operator" – LulY May 02 '23 at 13:25
  • You can use `|>`, which uses `_` (with a much more limited scope) and never uses `.` – r2evans May 02 '23 at 13:26
  • @r2evans But `1:10 |> mean(na.rm = TRUE)` works even though the first argument is not provided? – LulY May 02 '23 at 13:36

2 Answers


A pipe that explicitly replaces a placeholder can be implemented (fairly) straightforwardly since the base R function substitute implements the bulk of this. The one caveat is that substitute expects an unquoted expression, so we need to work around that, and we obviously need to evaluate the resulting expression:

`%>%` = function (lhs, rhs) {
  subst = call('substitute', substitute(rhs), list(. = lhs))
  eval.parent(eval(subst))
}

Here I am reusing the %>% operator but if you are concerned about conflicts with ‘magrittr’ you may obviously use another operator, e.g. %|%.

1:5 %>% sum(na.rm = TRUE, .) %>% identity()
# Error in identity() : argument "x" is missing, with no default

1:5 %>% sum(na.rm = TRUE, .) %>% identity(.)
# [1] 15

The above works by replacing each occurrence of . in the RHS with the value provided by the LHS. That is, during evaluation . is not a name. This is usually the expected and desired semantic. However, it means that you cannot assign to . inside the RHS (including calling replacement functions), because . is not a name.

So something like {names(.) = "foo"; .} does not work.
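To see why, it can help to look at the expression that the first implementation ends up evaluating. The following is only a minimal sketch; `rhs`, `substituted` and `result` are names introduced here for illustration, and the value 1 stands in for the LHS.

```r
# Build the same substituted expression as the first `%>%` implementation,
# using the value 1 as the LHS. (Illustrative names, not part of the answer.)
rhs = quote({names(.) = "foo"; .})
substituted = eval(call('substitute', rhs, list(. = 1)))

# Every `.` has been replaced by the value 1, so the assignment target
# is now `names(1)`, which R cannot assign to.
print(substituted)

# Evaluating the substituted expression therefore raises an error.
result = tryCatch(eval(substituted), error = function (e) 'assignment failed')
print(result)
```

Because the placeholder has already been replaced by a value before evaluation starts, there is no variable left for the replacement function to modify.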

We can fix this with a different implementation, which does not replace . with the LHS but rather injects a definition of . into the environment where the RHS is evaluated:

`%>%` = function (lhs, rhs) {
  eval_env = new.env(parent = parent.frame())
  eval_env$. = lhs
  eval(substitute(rhs), envir = eval_env)
}

Now we can use . as a name in assignments:

1 %>% {names(.) = "foo"; .}
# foo
#   1

However, some other things no longer work, because we now evaluate the expression in a different environment:

1:5 %>% assign("x", .)
x
# Error: object 'x' not found

… whereas this did work with the first pipe implementation. We could make it work again by injecting . directly into the calling environment, but this messes with the user environment and I would strongly discourage doing that.[1]

Instead, if you want to do things like calling assign within a pipeline expression, be explicit about which environment you want to assign into (i.e. pass something like envir = the_environment to assign).
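As a rough sketch of that advice: the operator name `%dot%` and the environment `target_env` below are hypothetical names chosen for this example, and the operator body simply repeats the environment-injecting implementation shown earlier.

```r
# Hypothetical operator name, chosen to avoid clashing with magrittr's `%>%`;
# the body is the environment-injecting implementation from above.
`%dot%` = function (lhs, rhs) {
  eval_env = new.env(parent = parent.frame())
  eval_env$. = lhs
  eval(substitute(rhs), envir = eval_env)
}

# `target_env` is an illustrative name for wherever the result should go.
target_env = new.env()

# Naming the target environment explicitly means the assignment no longer
# depends on where the pipe happens to evaluate its RHS.
1:5 %dot% assign("x", ., envir = target_env)

get("x", envir = target_env)
# [1] 1 2 3 4 5
```

The same pattern should work with `envir = globalenv()` or any other environment that is visible from the calling code.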


[1] You could avoid messing with the user environment by carefully preserving the user state and cleaning up afterwards; but this leads to a much more complex (and error-prone) implementation:

`%>%` = function (lhs, rhs) {
  caller = parent.frame()
  
  if (exists('.', envir = caller, inherits = FALSE)) {
    stored_dot = caller$.
    on.exit({caller$. = stored_dot})
  } else {
    on.exit(rm(., envir = caller))
  }
  
  caller$. = lhs
  eval.parent(substitute(rhs))
}

(This is a simplified implementation; do not use it! In particular, it fails if . refers to an active binding in the parent environment. We could add handling for active bindings — with difficulty — but it’s entirely likely that I forgot some other edge case; like I said, this is error-prone.)

Konrad Rudolph
  • Interestingly it is not for all code same as `{rhs}`. Example: `data.frame(1:10, row.names = letters[1:10]) %newpipe% rownames(.) <- 1:nrow(.); .` does not work but `data.frame(1:10, row.names = letters[1:10]) %>% {rownames(.) <- 1:nrow(.); .}` does. – LulY May 02 '23 at 13:50
  • Wow there is a *lot* to learn for me from this answer! Thanks for that! – LulY May 02 '23 at 15:01

As suggested in the comments, you can define your own operator.
Here . is set to the value of the lhs in a new environment in which the rhs is evaluated.

`:=` <- function(lhs, rhs) eval(substitute(rhs), list(. = lhs))

1:10 := mean(., na.rm = TRUE)
#[1] 5.5

1:10 := mean(na.rm = TRUE)
#Error in mean.default(na.rm = TRUE) : 
#  argument "x" is missing, with no default

data.frame(1:10, row.names = letters[1:10]) := {rownames(.)  <- 1:nrow(.); .}
#   X1.10
#1      1
#2      2
#3      3
#4      4
#5      5
#6      6
#7      7
#8      8
#9      9
#10    10

1 := {names(.) = "foo"; .}
#foo
#  1

1 %>% assign("x", .)
x
# Error: object 'x' not found

The name chosen for the operator influences how a chain is evaluated, because := and %…% operators have different precedence. See: Same function but using for it the name %>% causes a different result compared when using the name :=.

`%|%` <- `:=`

1 := . + 2 := . * 3
#[1] 9

1 %|% . + 2 %|% . * 3
#[1] 7
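
The difference comes from how R parses the two operators: `:=` is parsed like an assignment operator (lowest precedence, grouping right to left), whereas `%any%` operators bind more tightly than `*` and `+`. The sketch below makes the grouping visible; it redefines both operators so that it runs on its own, and `with_walrus` and `with_percent` are names introduced only for this illustration.

```r
`:=` <- function(lhs, rhs) eval(substitute(rhs), list(. = lhs))
`%|%` <- `:=`

# `:=` parses like an assignment: lowest precedence, right to left.
# The chain is one outer `:=` call whose RHS is itself a `:=` call.
with_walrus <- quote(1 := . + 2 := . * 3)
as.character(with_walrus[[1]])
as.character(with_walrus[[3]][[1]])

# `%|%` binds tighter than `*` and `+`, so the top-level call is `+`.
with_percent <- quote(1 %|% . + 2 %|% . * 3)
as.character(with_percent[[1]])

# Writing out the grouping that R applies to the `%|%` version:
(1 %|% .) + ((2 %|% .) * 3)
#[1] 7
```

If the `%|%` form is preferred, parentheses or curly brackets around each step restore the intended left-to-right chaining.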
GKi
  • Please, *please*, for all that's holy **please** don't use string quotes to make syntactic names. R should not allow this, it is a huge flaw in the language and it terminally confuses beginners. Use backtick quotes. Besides this it would be good if you discussed the caveats of this rather hacky solution. – Konrad Rudolph May 02 '23 at 13:27
  • Huh ... perhaps there are other operators one could define besides `:=`? It's used extensively in `data.table` (and even defined in `rlang`), there are likely collisions here (though I recognize the question is mostly dplyr). – r2evans May 02 '23 at 13:27
  • Nice edit, thanks. For the sake of completeness, one could also name it `%cecinestpasunpipe%` ;-) – r2evans May 02 '23 at 13:30
  • @KonradRudolph Would you mind when I also ask a SO question about string quotes and syntactic names? – GKi May 26 '23 at 07:13
  • @GKi Please go ahead. But the short answer is that `"a" = 1` is valid R and treats `"a"` as a variable name (?!?!). but `b = "a"` does *not* treat `"a"` as a variable name. **This is idiotic**. There's really not much more to say on the topic. – Konrad Rudolph May 26 '23 at 07:26
  • I have asked a question to allow finding this topic easier than in a comment. https://stackoverflow.com/questions/76342471 – GKi May 26 '23 at 16:01