Is there a function that takes one dataset, one column, and one operator, but several values against which to evaluate a condition?

v1 <- c(1:3)
v2 <- c("a", "b", "c")
df <- data.frame(v1, v2)

Options for subsetting programmatically:

result <- df[df$v2 == "a" | df$v2 == "b", ]
result
  v1 v2
1  1  a
2  2  b

Or, for more robustness (indexing the column by position):

result1 <- df[ df[[2]] == "a" | df[[2]] == "b", ]
result1
  v1 v2
1  1  a
2  2  b

Alternatively, for easier syntax:

library(dplyr)
result2 <- filter(df, v2 == "a" | v2 == "b")
result2
  v1 v2
1  1  a
2  2  b

(Am I right to assume that I can safely use dplyr's filter() inside a function?)

I did not include subset() above as it is known to be for interactive use only.

In all the cases above, one has to repeat the condition (v2 == "a" | v2 == "b").

I'm looking for a function to which I could pass a vector of values, such as c("a", "b"), because I would like to pass a large number of values and automate the process.

Such a function could perhaps look like:

fun(df, col = v2, operator = "|", value = c("a", "b"))

Thank you

jpinelo

1 Answer

We can use %in% if the number of elements to check is more than 1.

df[df$v2 %in% c('a', 'b'),]
#   v1 v2
#1  1  a
#2  2  b

Or if we use subset, the df$ can be removed

subset(df, v2 %in% c('a', 'b'))

Or the dplyr::filter

filter(df, v2 %in% c('a', 'b'))

This can be wrapped in a function

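f1 <- function(dat, col, val){
  # {{ col }} forwards the unquoted column name to filter() (dplyr >= 1.0.0);
  # a bare `col` would only work if a vector named v2 also exists in the calling environment
  filter(dat, {{ col }} %in% val)
}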

f1(df, v2, c('a', 'b'))
#  v1 v2
#1  1  a
#2  2  b
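
For fully programmatic use, the column name can also be passed as a string, which sidesteps the question of how filter() resolves an unquoted argument. The following is only a minimal base R sketch; the function name filter_in is hypothetical and not part of any package.

```r
# Hypothetical helper: keep the rows of `dat` whose column `col`
# (given as a string) matches any of the values in `val`.
filter_in <- function(dat, col, val) {
  dat[dat[[col]] %in% val, , drop = FALSE]
}

df <- data.frame(v1 = 1:3, v2 = c("a", "b", "c"), stringsAsFactors = FALSE)
res <- filter_in(df, "v2", c("a", "b"))
```

Because `col` is an ordinary character string here, it can come from a variable or a loop without any quoting tricks.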

If we need to use ==, we could loop the vector to compare in a list and use Reduce with |

df[Reduce(`|`, lapply(letters[1:2], `==`, df$v2)),]
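
To make the Reduce step easier to follow, here is the same idea spelled out in stages. This is a sketch with illustrative variable names (vals, masks, mask), assuming the column contains no NA values; with NAs, == returns NA where %in% returns FALSE.

```r
df <- data.frame(v1 = 1:3, v2 = c("a", "b", "c"), stringsAsFactors = FALSE)
vals <- c("a", "b")

# One logical vector per value: is v2 equal to that value?
masks <- lapply(vals, function(x) df$v2 == x)

# Combine the logical vectors element-wise with OR
mask <- Reduce(`|`, masks)

# For NA-free data this is the same mask that %in% produces
same <- identical(mask, df$v2 %in% vals)
res <- df[mask, ]
```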
akrun
  • Thanks for that. It solves the issue, as it takes one or more elements. Isn't the opposite of %in% written !%in%? I think I've used it before. Any idea why it would throw an error inside a function? Thanks – jpinelo Aug 29 '15 at 12:23
  • @jpinelo You may have to try `filter(df, !v2 %in% c('a', 'b'))` – akrun Aug 29 '15 at 12:26
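
On the negation asked about in the comments: `!%in%` is not an operator in R, so the `!` has to go in front of the whole `%in%` expression. A short sketch follows; the `%notin%` operator defined in it is a hypothetical convenience, not something base R provides.

```r
df <- data.frame(v1 = 1:3, v2 = c("a", "b", "c"), stringsAsFactors = FALSE)

# Unary ! has lower precedence than %in%, so !x %in% y parses as !(x %in% y);
# the explicit parentheses just make that visible.
not_ab <- df[!(df$v2 %in% c("a", "b")), ]

# Hypothetical helper operator built with base R's Negate()
`%notin%` <- Negate(`%in%`)
not_ab2 <- df[df$v2 %notin% c("a", "b"), ]
```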