
I am having issues with pipes inside a custom function. Based on the previous posts, I understand that a pipe inside a function creates another level(?) which results in the error I'm getting (see below).

I'm hoping to write a summary function for a large data set with hundreds of numeric and categorical variables. I would like to have the option to use this on different data frames (with similar structure), always group by a certain factor variable and get summaries for multiple columns.

library(tidyverse)
data(iris)

iris %>% group_by(Species) %>% summarise(count = n(), mean = mean(Sepal.Length, na.rm = T))

# A tibble: 3 x 3
  Species    count  mean
  <fct>      <int> <dbl>
1 setosa        50  5.01
2 versicolor    50  5.94
3 virginica     50  6.59

I'm hoping to create a function like this:

sum_cols <- function(df, col) {
  df %>%
    group_by(Species) %>%
    summarise(count = n(),
              mean = mean(col, na.rm = T))
}

And this is the error I'm getting:

sum_cols(iris, Sepal.Length)

Error in mean(col, na.rm = T) : object 'Sepal.Length' not found
Called from: mean(col, na.rm = T)

I have had this problem for a while and even though I tried to get answers in a few previous posts, I haven't quite grasped why the problem occurs and how to get around it.

Any help would be greatly appreciated, thanks!

Karina

1 Answer


Try searching for non-standard evaluation (NSE).

Here you can wrap the argument in {{ }} (the "curly-curly" operator) to tell dplyr that col refers to a column of df rather than to an object in the calling environment.

library(dplyr)
library(rlang)

sum_cols <- function(df, col) {
  df %>%
    group_by(Species) %>%
    summarise(count = n(), mean = mean({{ col }}, na.rm = TRUE))
}

sum_cols(iris, Sepal.Length)

# A tibble: 3 x 3
#  Species    count  mean
#  <fct>      <int> <dbl>
#1 setosa        50  5.01
#2 versicolor    50  5.94
#3 virginica     50  6.59
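
Since the question asks for summaries of several columns at once, the same {{ }} idea can be extended with across(). This is only a sketch: it assumes dplyr 1.0.0 or later (where across() is available), and the name sum_cols_multi and the "mean_" prefix are made up for illustration.

```r
library(dplyr)

# Hypothetical multi-column variant of sum_cols().
# `cols` is a tidyselect expression, e.g. c(Sepal.Length, Petal.Width).
sum_cols_multi <- function(df, cols) {
  df %>%
    group_by(Species) %>%
    summarise(
      count = n(),
      # {{ cols }} forwards the caller's column selection to across()
      across({{ cols }}, ~ mean(.x, na.rm = TRUE), .names = "mean_{.col}")
    )
}

res_multi <- sum_cols_multi(iris, c(Sepal.Length, Petal.Width))
res_multi
```

The result has one row per Species, with a count column and one mean_<column> column for each selected variable.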

If rlang is older than 0.4.0, {{ }} is not available. In that case, use the older approach of capturing the argument with enquo() and unquoting it with !!:

sum_cols <- function(df, col) {
  df %>%
    group_by(Species) %>%
    summarise(count = n(), mean = mean(!!enquo(col), na.rm = TRUE))
}

sum_cols(iris, Sepal.Length)
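
Another option that avoids quoting and unquoting altogether is to pass the column name as a string and look it up through the .data pronoun. This is a minimal sketch rather than part of the original answer: the function name sum_cols_chr is made up, and although .data predates {{ }}, it has not been checked here against rlang 0.3.1 specifically.

```r
library(dplyr)

# Hypothetical variant that takes the column name as a character string.
sum_cols_chr <- function(df, col) {
  df %>%
    group_by(Species) %>%
    # .data[[col]] looks up the column whose name is stored in `col`
    summarise(count = n(), mean = mean(.data[[col]], na.rm = TRUE))
}

res_chr <- sum_cols_chr(iris, "Sepal.Length")
res_chr
```

Passing strings can be convenient when the column names already live in a character vector, for example when looping over many variables.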
Ronak Shah
  • Thanks for your help! Unfortunately, I'm still getting the same error. Any idea why? – Karina Apr 03 '20 at 10:25
  • @Karina Do you have `rlang` 0.4.0 or higher installed? Load the library with `library(rlang)` and then try this. – Ronak Shah Apr 03 '20 at 10:29
  • I have 0.3.1 - I have a work computer, and installing and updating packages is not up to me, sadly. I tried it on my personal computer, and it worked fine so it does seem to be an issue with the older versions. – Karina Apr 03 '20 at 10:35
  • Ok...updated the answer which should work similarly. – Ronak Shah Apr 03 '20 at 10:38