
I want to group a data frame by a column whose name is passed in dynamically as a string.

df:
col1
a
b
c
d
a
c
d
b
a
b
d

I created a function like the one below:

fun1 <- function(df,column_name){
  
  col_name1 = noquote(column_name)
  
  out_df = df %>% group_by(col_name1)%>%dplyr::summarise('Count'=n())
                                                              
  return(out_df)
}

where `column_name` is a string, for example `column_name = 'col1'`.

When I apply that function, it gives the error below:

Error: Must group by variables found in `.data`.
* Column `col_name1` is not found.

I get this error even though the column exists. Where have I gone wrong?

Ronak Shah
Navya

2 Answers

library(dplyr)
fun1 <- function(df,column_name){
  
  col_name1 <-  sym(column_name)
  
  out_df <-  df %>% 
    group_by(!!col_name1) %>%
    summarise('Count' = n())
  
  return(out_df)
}

fun1(iris, "Species")

# A tibble: 3 x 2
  Species    Count
  <fct>      <int>
1 setosa        50
2 versicolor    50
3 virginica     50

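This should also work, with the advantage of accepting several column names at once:

fun1 <- function(df, column_name){
  # all_of() selects the columns named in the character vector;
  # it replaces one_of(), which is superseded in current tidyselect
  df %>% 
    group_by(across(all_of(column_name))) %>%
    summarise('Count' = n())
}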
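
As a quick check of the multi-column claim, the sketch below passes a character vector of two names. It assumes dplyr 1.0.0 or later and the built-in `mtcars` data; `count_by` is a hypothetical name chosen here so it does not clash with `fun1`, and it uses `all_of()`, the current tidyselect helper for character vectors.

```r
library(dplyr)

# Hypothetical helper: groups by every column named in `column_names`
# (a character vector of length one or more) and counts rows per group.
count_by <- function(df, column_names) {
  df %>%
    group_by(across(all_of(column_names))) %>%
    summarise(Count = n(), .groups = "drop")
}

# Two grouping columns supplied as strings.
res <- count_by(mtcars, c("cyl", "gear"))
res
```

The result should have one row per observed combination of `cyl` and `gear`, with the counts adding up to the number of rows in `mtcars`.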
yogevmh
  • Hi, I am getting the same error. – Navya Jun 21 '21 at 07:50
  • `fun1(iris, "Species")` works for me. – yogevmh Jun 21 '21 at 07:51
  • Yes, it works now. Can you explain the `!!` operator before the column name? – Navya Jun 21 '21 at 07:53
  • `sym()` converts the string into a symbol, and `!!` unquotes it, so `group_by()` uses the column named by the value of `col_name1` rather than looking for a column literally called `col_name1`. There is a very good explanation here: https://dplyr.tidyverse.org/articles/programming.html – yogevmh Jun 21 '21 at 07:55
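
To make the `sym()` and `!!` explanation concrete, here is a minimal sketch using the built-in `iris` data. The variable names `col_sym`, `with_bang` and `direct` are illustrative only and do not come from the answer.

```r
library(dplyr)

column_name <- "Species"

# sym() turns the string into a symbol, i.e. a bare column name held as an R object.
col_sym <- sym(column_name)
class(col_sym)

# Writing group_by(col_sym) would make dplyr look for a column called `col_sym`,
# which is the situation behind the error quoted in the question.
# !! injects the symbol stored in col_sym, so this groups by Species.
with_bang <- iris %>% group_by(!!col_sym) %>% summarise(Count = n())

# The same grouping written by hand, for comparison.
direct <- iris %>% group_by(Species) %>% summarise(Count = n())

all.equal(with_bang, direct)
```

The two pipelines are meant to be equivalent; the only difference is that the first one receives the column name as a string at run time.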

You can use the `.data` pronoun:

library(dplyr)

fun1 <- function(df, column_name){
  # .data[[column_name]] looks up the column named by the string
  out_df = df %>% group_by(.data[[column_name]]) %>% summarise(Count = n())
  return(out_df)
}

fun1(df, 'col1')

#  col1  Count
#  <chr> <int>
#1 a         3
#2 b         3
#3 c         2
#4 d         3 

This can also be written with `count()`, which works the same way:

fun2 <- function(df,column_name){
  df %>% count(.data[[column_name]], name = 'Count')
}
fun2(df, 'col1')
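
For anyone who wants to run this end to end, the sketch below rebuilds the sample data from the question and compares the two approaches. The `data.frame()` call is an assumption about how `df` was created, and `by_group` and `by_count` are illustrative names.

```r
library(dplyr)

# Sample data copied from the question (assumed to be a one-column data frame).
df <- data.frame(
  col1 = c("a", "b", "c", "d", "a", "c", "d", "b", "a", "b", "d")
)

column_name <- "col1"

# group_by() + summarise() with the .data pronoun.
by_group <- df %>% group_by(.data[[column_name]]) %>% summarise(Count = n())

# count() with the .data pronoun; `name` sets the name of the count column.
by_count <- df %>% count(.data[[column_name]], name = "Count")

# Both approaches should report the same counts per group.
all.equal(by_group$Count, by_count$Count)
```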
Ronak Shah