136

When using the pipe operator %>% with packages such as dplyr, ggvis, dycharts, etc., how do I do a step conditionally? For example:

step_1 %>%
step_2 %>%

if(condition)
step_3

These approaches don't seem to work:

step_1 %>%
step_2 
if(condition) %>% step_3

step_1 %>%
step_2 %>%
if(condition) step_3

There is a long way:

if(condition)
{
step_1 %>%
step_2 
}else{
step_1 %>%
step_2 %>%
step_3
}

Is there a better way without all the redundancy?

mindlessgreen

6 Answers

161

Here is a quick example that takes advantage of the . and ifelse:

library(magrittr)  # provides %>% and the add() alias

X <- 1
Y <- TRUE

X %>% add(1) %>% { ifelse(Y, add(., 1), .) }

In the ifelse, if Y is TRUE it will add 1; otherwise it will just return the value passed in from the previous step. The . is a stand-in which tells the function where the output from the previous step of the chain goes, so I can use it on both branches.

Edit: As @BenBolker pointed out, you might not want ifelse, so here is an if version.

X %>%
  add(1) %>%
  { if (Y) add(., 1) else . }

Thanks to @Frank for pointing out that I should use { braces around my if and ifelse statements to continue the chain.
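
As a hedged illustration of the same brace pattern in a longer chain, here is a minimal sketch on a data frame. The names df and drop_missing are made up for this example and do not come from the question.

```r
library(magrittr)

# Hypothetical inputs, chosen only for illustration.
df <- data.frame(x = c(1, NA, 3))
drop_missing <- TRUE

n_rows <- df %>%
  # The braces stop magrittr from inserting the left-hand side as a first
  # argument; the dot is then the only reference to the incoming value.
  { if (drop_missing) na.omit(.) else . } %>%
  nrow()

n_rows
```

With drop_missing set to FALSE, the braces simply pass df through unchanged, so every row would be counted.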

John Paul
  • I like the post-edit version. `ifelse` seems unnatural for control flow. – Frank Jun 02 '15 at 19:12
  • One thing to note: if there is a later step in the chain, use `{}`. For example, if you don't have them here, bad things happen (just printing `Y` for some reason): `X %>% "+"(1) %>% {if(Y) "+"(1) else .} %>% "*"(5)` – Frank Jun 02 '15 at 19:20
  • Use of the magrittr alias `add` would make the example clearer. – ctbrown Aug 27 '15 at 15:49
  • In code golfing terms, this specific example could be written as `X %>% add(1*Y)` but of course that doesn't answer the original question – talat Feb 23 '16 at 21:33
  • I find, as written, both of these return an error message for me, in which argument y is missing from add. I find replacing `{if(Y) add( 1) else .}` with `{if(Y) add(., 1) else .}` seems to work though – ohnoplus Dec 28 '17 at 23:31
  • They don't return error messages on my side (and `add` doesn't have a `y` argument but `e1` and `e2`), but they return the wrong value (`1`). The way you suggest works fine, however, and returns `3` – moodymudskipper Sep 03 '18 at 18:22
  • Why does it change the data frame to a list containing only the first column? Example: `data.frame(a = c(1,2,3), b = c(4,5,6)) %>% { ifelse(T, ., 0) }` returns `[[1]] [1] 1 2 3` – Karol Daniluk Apr 28 '19 at 12:19
  • {ifelse (Y, x, .)} returns a list, {if Y x else .} returns the same object's class. I do not know what – Captain Tyler Aug 12 '19 at 13:38
  • One important thing within the conditional block between `{}` is that you must reference the preceding argument of the dplyr pipe (also called LHS) with the dot (.) - otherwise the conditional block does not receive the . argument! – Agile Bean Sep 12 '19 at 08:09
  • Hey @john-paul, may I ask you to check this post here => https://stackoverflow.com/questions/70499321/conditional-values-using-if-else-within-shiny-app-using-tidyverse-and-dplyr-to-g – Luis Dec 27 '21 at 19:01
47

Edit: purrr::when() is deprecated as of {purrr} version 1.0.0

I think that's a case for purrr::when(). Let's sum up a few numbers if their sum is below 25, otherwise return 0.


library("magrittr")
1:3 %>% 
  purrr::when(sum(.) < 25 ~ sum(.), ~0)
#> [1] 6

when returns the value resulting from the action of the first valid condition. Put the condition to the left of ~, and the action to the right of it. Above, we only used one condition (and then an else case), but you can have many conditions.

You can easily integrate that into a longer pipe.
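
Since when() is deprecated, the same logic can be sketched with a plain if ... else inside braces. This is an alternative that relies only on magrittr, not a purrr replacement API.

```r
library(magrittr)

total <- 1:3 %>%
  { if (sum(.) < 25) sum(.) else 0 }

total
#> [1] 6
```

Further conditions can be chained with else if inside the same braces.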

Lorenz Walthert
23

Here is a variation on the answer provided by @JohnPaul. This variation uses the `if` function instead of a compound if ... else ... statement.

library(magrittr)

X <- 1
Y <- TRUE

X %>% `if`(Y, . + 1, .) %>% multiply_by(2)
# [1] 4

Note that in this case the curly braces are not needed around the `if` function, nor around an ifelse function—only around the if ... else ... statement. However, if the dot placeholder appears only in a nested function call, then magrittr will by default pipe the left hand side into the first argument of the right hand side. This behavior is overridden by enclosing the expression in curly braces. Note the difference between these two chains:

X %>% `if`(Y, . + 1, . + 2)
# [1] TRUE
X %>% {`if`(Y, . + 1, . + 2)}
# [1] 4

The dot placeholder is nested within a function call both times it appears in the `if` function, since . + 1 and . + 2 are interpreted as `+`(., 1) and `+`(., 2), respectively. So, the first expression is returning the result of `if`(1, TRUE, 1 + 1, 1 + 2), (oddly enough, `if` doesn't complain about extra unused arguments), and the second expression is returning the result of `if`(TRUE, 1 + 1, 1 + 2), which is the desired behavior in this case.

For more information on how the magrittr pipe operator treats the dot placeholder, see the help file for %>%, in particular the section on "Using the dot for secondary purposes".
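
The first-argument rule described above can be seen in isolation with a small sketch (the variable names nested and braced are only for illustration):

```r
library(magrittr)

# The dot appears only inside a nested call, so 4 is ALSO inserted as the
# first argument: this evaluates c(4, sqrt(4)).
nested <- 4 %>% c(sqrt(.))

# Braces suppress the first-argument insertion: this evaluates c(sqrt(4)).
braced <- 4 %>% { c(sqrt(.)) }

nested  # 4 2
braced  # 2
```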

Uwe
Cameron Bieganek
  • What is the difference between using the `if` function and `ifelse`? Are they identical in behavior? – Agile Bean Oct 02 '19 at 11:03
  • @AgileBean The behavior of the `if` and `ifelse` functions is not identical. The `ifelse` function is a vectorized `if`. If you provide the `if` function with a logical vector, it will print a warning and it will only use the first element of that logical vector. Compare `\`if\`(c(T, F), 1:2, 3:4)` to `ifelse(c(T, F), 1:2, 3:4)`. – Cameron Bieganek Oct 02 '19 at 15:01
  • great, thanks for the clarification! So as the above problem is non-vectorized, you could have also written your solution as `X %>% { ifelse(Y, .+1, .+2) }` – Agile Bean Oct 03 '19 at 02:51
16

It would seem easiest to me to back off from the pipes a little tiny bit (although I would be interested in seeing other solutions), e.g.:

library("dplyr")
z <- data.frame(a=1:2)
z %>% mutate(b=a^2) -> z2
if (z2$b[1]>1) {
    z2 %>% mutate(b=b^2) -> z2
}
z2 %>% mutate(b=b^2) -> z3

This is a slight modification of @JohnPaul's answer (you might not really want ifelse, which is vectorized and shapes its result like the test rather than like its inputs). It would be nice to modify this to return . automatically if the condition is false ... (caution: I think this works but haven't really tested/thought about it too much ...)

iff <- function(cond, x, y) {
    if (cond) x else y
}

# the condition refers to the piped value via the dot, not to z2 from above
z %>% mutate(b=a^2) %>%
    iff(cond=.$b[1]>1, mutate(.,b=b^2), .) %>%
    mutate(b=b^2) -> z4
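
To illustrate the remark above about not wanting ifelse, here is a small base R sketch. The data frame df is made up for the example. It shows how ifelse reshapes a data frame while a plain if ... else leaves it alone.

```r
df <- data.frame(a = 1:3, b = 4:6)

# ifelse() builds its result from the shape of the test (length 1 here),
# so only the first column survives, wrapped in a list.
via_ifelse <- ifelse(TRUE, df, 0)

# A plain if ... else returns the chosen object untouched.
via_if <- if (TRUE) df else 0

class(via_ifelse)  # "list"
class(via_if)      # "data.frame"
```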
Ben Bolker
13

I like purrr::when, and the other base solutions provided here are all great, but I wanted something more compact and flexible, so I designed the function pif (pipe if). See the code and documentation at the end of the answer.

Arguments can be either expressions or functions (formula notation is supported), and the input is returned unchanged by default if the condition is FALSE.

Used on examples from other answers:

## from Ben Bolker
data.frame(a=1:2) %>% 
  mutate(b=a^2) %>%
  pif(~b[1]>1, ~mutate(.,b=b^2)) %>%
  mutate(b=b^2)
#   a  b
# 1 1  1
# 2 2 16

## from Lorenz Walthert
1:3 %>% pif(sum(.) < 25,sum,0)
# [1] 6

## from clbieganek 
1 %>% pif(TRUE,~. + 1) %>% `*`(2)
# [1] 4

# from theforestecologist
1 %>% `+`(1) %>% pif(TRUE ,~ .+1)
# [1] 3

Other examples :

## using functions
iris %>% pif(is.data.frame, dim, nrow)
# [1] 150   5

## using formulas
iris %>% pif(~is.numeric(Species), 
             ~"numeric :)",
             ~paste(class(Species)[1],":("))
# [1] "factor :("

## using expressions
iris %>% pif(nrow(.) > 2, head(.,2))
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa

## careful with expressions
iris %>% pif(TRUE, dim,  warning("this will be evaluated"))
# [1] 150   5
# Warning message:
# In inherits(false, "formula") : this will be evaluated
iris %>% pif(TRUE, dim, ~warning("this won't be evaluated"))
# [1] 150   5

Function

#' Pipe friendly conditional operation
#'
#' Apply a transformation on the data only if a condition is met, 
#' by default if condition is not met the input is returned unchanged.
#' 
#' The use of formula or functions is recommended over the use of expressions
#' for the following reasons :
#' 
#' \itemize{
#'   \item If \code{true} and/or \code{false} are provided as expressions they 
#'   will be evaluated whether the condition is \code{TRUE} or \code{FALSE}.
#'   Functions or formulas on the other hand will be applied on the data only if
#'   the relevant condition is met
#'   \item Formulas support calling directly a column of the data by its name 
#'   without \code{x$foo} notation.
#'   \item Dot notation will work in expressions only if `pif` is used in a pipe
#'   chain
#' }
#' 
#' @param x An object
#' @param p A predicate function, a formula describing such a predicate function, or an expression.
#' @param true,false Functions to apply to the data, formulas describing such functions, or expressions.
#'
#' @return The output of \code{true} or \code{false}, either as expressions or applied on data as functions
#' @export
#'
#' @examples
#'# using functions
#'pif(iris, is.data.frame, dim, nrow)
#'# using formulas
#'pif(iris, ~is.numeric(Species), ~"numeric :)",~paste(class(Species)[1],":("))
#'# using expressions
#'pif(iris, nrow(iris) > 2, head(iris,2))
#'# careful with expressions
#'pif(iris, TRUE, dim,  warning("this will be evaluated"))
#'pif(iris, TRUE, dim, ~warning("this won't be evaluated"))
pif <- function(x, p, true, false = identity){
  if(!requireNamespace("purrr")) 
    stop("Package 'purrr' needs to be installed to use function 'pif'")

  if(inherits(p,     "formula"))
    p     <- purrr::as_mapper(
      if(!is.list(x)) p else update(p,~with(...,.)))
  if(inherits(true,  "formula"))
    true  <- purrr::as_mapper(
      if(!is.list(x)) true else update(true,~with(...,.)))
  if(inherits(false, "formula"))
    false <- purrr::as_mapper(
      if(!is.list(x)) false else update(false,~with(...,.)))

  if ( (is.function(p) && p(x)) || (!is.function(p) && p)){
    if(is.function(true)) true(x) else true
  }  else {
    if(is.function(false)) false(x) else false
  }
}
moodymudskipper
  • "Functions or formulas on the other hand will be applied on the data only if the relevant condition is met." Can you explain why you decided to do so? – mihagazvoda Sep 14 '20 at 06:55
  • So I compute only what I need to compute, but I wonder why I didn't do it with expressions. For some reason it seems I didn't want to use non standard evaluation. I think I have a modified version in my custom functions, I'll update when I get the chance. – moodymudskipper Sep 14 '20 at 08:15
2

A possible solution is to use an anonymous function:

library(magrittr)
1 %>% 
  (\(.) if (T) . + 1 else .) %>% 
  multiply_by(2)
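
On R 4.1 or later, the same idea can be sketched with the native pipe |> instead of magrittr. The flag name do_step is hypothetical. With the native pipe, each anonymous function has to be wrapped in parentheses and called explicitly:

```r
do_step <- TRUE  # hypothetical condition

res <- 1 |>
  (\(x) if (do_step) x + 1 else x)() |>
  (\(x) x * 2)()

res
#> [1] 4
```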
Julien