
I am stuck trying to do something that should be simple: use grep() to test for a pattern in multiple variables of a single data frame. All my searches lead to instructions on how to grep() for multiple patterns instead.

Create data:

df <- data.frame(a = c("apple", "plum", "pair", "apple"), 
                 b = c(1, 2, 3, 4), 
                 c = c("plum", "apple", "grape", "orange"))
df
      a b      c
1 apple 1   plum
2  plum 2  apple
3  pair 3  grape
4 apple 4 orange

Now I want to check df$a and df$c for the string "apple", because I want the values from df$b for all rows with "apple" in either df$a or df$c.

My hope was to create a function, f(x) = grep("apple", df$x), and use lapply() to apply it over the list of variable names that I want to check for the pattern:

check_apple <- function(x) {
   grep("apple", df$x)
}

But this doesn't work:

check_apple(a)
integer(0)

However this does work:

grep("apple", df$a)
[1] 1 4

Why doesn't this function work? Can I not use a variable name as an argument in my function?

My plan was to apply the function to all the variables, then collapse the resulting list into a single vector and take its unique() values, giving me every row of the data frame that has a string match in any of those variables. It goes without saying that my dataset is much larger than this example.

Can I fix the function, or is there another way to run grep() over multiple variables?

oguz ismail
MorrisseyJ
  • In your function: `grep("apple", df[, x])`, where `x` is a character string (i.e. `"a"`). – M-- May 22 '19 at 19:51
  • https://stackoverflow.com/questions/2641653/pass-a-data-frame-column-name-to-a-function – M-- May 22 '19 at 19:51

1 Answer


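Your function doesn't work because `$` does not evaluate its right-hand side: inside the function, df$x always looks for a column literally named x, no matter what you pass as the argument. df has no such column, so df$x is NULL and grep() returns integer(0). The object a that you pass doesn't exist in your environment either, but R evaluates arguments lazily and x is never actually used, so no error is raised. The function fails quietly and gives no hint that this is happening, which can be one of the challenging things in R.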
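To make the difference concrete, here is a minimal sketch using the data frame from the question. The variable names x and literal are only for illustration.

```r
df <- data.frame(a = c("apple", "plum", "pair", "apple"),
                 b = c(1, 2, 3, 4),
                 c = c("plum", "apple", "grape", "orange"))

x <- "a"

# `$` takes its right-hand side literally: this looks for a column named "x"
literal <- df$x
is.null(literal)          # TRUE, because df has no column called "x"

# grep() on NULL finds nothing, which is the integer(0) from the question
grep("apple", literal)

# `[[` evaluates x first, so this looks up column "a"
grep("apple", df[[x]])
```

The last call returns the same positions as grep("apple", df$a), because x is evaluated to "a" before the column lookup happens.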

One way to get your function to work is to pass the name of the column to the function as a character string, and use it to select the right column in the data frame:

check_apple <- function(x) {
 grep("apple", df[, x])
}


check_apple('a')
[1] 1 4
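
With a function that takes column names as strings, the plan from the question (apply over the columns, collapse the list, keep the unique row numbers) could be sketched as follows. check_pattern() is a hypothetical generalisation of check_apple() that also takes the data frame and the pattern as arguments, so it does not rely on a global df.

```r
df <- data.frame(a = c("apple", "plum", "pair", "apple"),
                 b = c(1, 2, 3, 4),
                 c = c("plum", "apple", "grape", "orange"))

# Hypothetical helper: row positions in one column that match the pattern
check_pattern <- function(col, data, pattern) {
  grep(pattern, data[[col]])
}

# Columns to search, given as character strings
cols <- c("a", "c")

# One vector of matching row positions per column
matches <- lapply(cols, check_pattern, data = df, pattern = "apple")

# Collapse the list, drop duplicates, and put the rows in order
rows <- sort(unique(unlist(matches)))
rows

# Values of b for every row with a match in a or c
df$b[rows]
```

For this example, rows is 1, 2, 4, and df$b[rows] gives the corresponding values of b. An alternative worth considering is grepl(), which returns one logical value per row and can be combined across columns with |.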
astrofunkswag