I have a workflow where I supply a vector of strings representing column names to a function that calls group_by on those columns. It works when I test it with one column name, but fails when I pass it multiple column names.
The basic setup is this:
library(dplyr)
library(rlang)
group_summs <- function(df, grouping_vars) {
  if (length(grouping_vars == 1)) {
    group_var <- ensym(grouping_vars)
    df %>%
      group_by(!! group_var) %>%
      summarise(n_test = n())
  } else {
    group_vars <- grouping_vars
    df %>%
      group_by_at(.vars = group_vars) %>%
      summarise(n_test = n())
  }
}
# Single column test
flights <- nycflights13::flights
col_test <- c("origin")
# This works
group_summs(flights, col_test)
# Multiple columns test
col_test_2 <- c("origin", "carrier")
# This fails
group_summs(flights, col_test_2)
So passing a single column name runs, but passing multiple column names gives an rlang error:
"Error: Only strings can be converted to symbols
Call rlang::last_error() to see a backtrace
Called from: rlang::abort(x)"
What I really don't get is why the multiple column example runs correctly outside of the function as in:
#Runs just fine
col_test_2 <- c("origin", "carrier")
flights %>% group_by_at(.vars = col_test_2) %>% summarise(n_test = n())
Is there something about the function environment that I am not understanding, or is this a bug?
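A minimal base R sketch that may help narrow this down: it checks, without dplyr, what the condition `length(grouping_vars == 1)` from the function evaluates to for a two-element vector, compared with `length(grouping_vars) == 1`. The names `inside`, `outside`, and `branch` are illustrative only and are not part of the original function.

```r
# Hypothetical stand-in for the grouping_vars argument.
grouping_vars <- c("origin", "carrier")

# Parenthesis closes after the comparison: each element is compared
# to 1 first, and length() then counts the comparison results.
inside <- length(grouping_vars == 1)

# Parenthesis closes before the comparison: length() counts the
# elements, and that count is then compared to 1.
outside <- length(grouping_vars) == 1

print(inside)   # 2
print(outside)  # FALSE

# if() coerces a non-zero number to TRUE, so the first branch is taken.
branch <- if (inside) "single-column branch" else "multi-column branch"
print(branch)   # "single-column branch"
```

Since the function uses the same condition, the two-column call would take the ensym() branch rather than the group_by_at() one, which might explain why the standalone group_by_at() call behaves differently.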
I am using dplyr (0.8.3) and rlang (0.4.0).
This question is very similar to "Group by multiple columns in dplyr, using string vector input", but the solutions on that question give me the same error, so I wonder whether there is a more recent approach (the current solution there is from 2017).