I used tidyeval to write a short function that takes grouping variables as input, groups the mtcars dataset, and counts the number of occurrences per group:
library(dplyr)

test_function <- function(grps){
  mtcars %>%
    group_by(across({{grps}})) %>%
    summarise(Count = n())
}
test_function(grps = c(cyl, gear))
---
cyl gear Count
<dbl> <dbl> <int>
1 4 3 1
2 4 4 8
3 4 5 2
4 6 3 2
5 6 4 4
6 6 5 1
7 8 3 12
8 8 5 2
Now imagine that, for this example, I want a subtotal row for each cyl group: how many cars have 4 (6, 8) cylinders in total? This is what the result should look like:
test_function(grps = c(cyl, gear), subtotalrows = TRUE) ### example function execution
---
cyl gear Count
<dbl> <dbl> <int>
1 4 3 1
2 4 4 8
3 4 5 2
4 4 total 11
5 6 3 2
6 6 4 4
7 6 5 1
8 6 total 7
9 8 3 12
10 8 5 2
11 8 total 14
In this case, the subtotal rows I am looking for can simply be produced with the same function but with one fewer grouping variable:
test_function(grps = cyl)
---
cyl Count
<dbl> <int>
1 4 11
2 6 7
3 8 14
But since I don't want to call the function from within itself (I am not even sure whether this is possible in R), I would like to take a different approach. As far as I know, the best (and only) way to create subtotal rows so far is to calculate them separately and then bind them row-wise to the grouped table (e.g. with rbind or bind_rows). In my case that means taking only the first grouping variable, creating the subtotal rows, and then binding them to the table. This is where I run into problems with the tidyeval syntax. Here, in pseudocode, is what I would like to do in the function:
test_function <- function(grps, subtotalrows = TRUE){
  grouped_result <- mtcars %>%
    group_by(across({{grps}})) %>%
    summarise(Count = n())
  if(subtotalrows == FALSE){
    return(grouped_result)
  } else {
    #pseudocode
    group_for_subcalculation <- grps[[1]] #I want the first element of the grps argument
    subtotal_result <- mtcars %>%
      group_by(across({{group_for_subcalculation}})) %>%
      summarise(Count = n()) %>%
      mutate(grps[[2]] := "total") %>%
      arrange(grps[[1]], grps[[2]], Count)
    return(rbind(grouped_result, subtotal_result))
  }
}
So, two questions. First, how can I extract the first column name passed via grps and work with it in the code that follows? Second, this pseudocode example is specific to two columns passed via grps. Suppose I want to pass three or more: how would you do that (loops)?
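One possible direction, as a minimal sketch rather than a definitive answer: resolve the grps expression to plain column names with tidyselect::eval_select(), then compute one summary per prefix of those names and bind the pieces together. The function name count_with_subtotals and its data argument are hypothetical and not part of the code above. The sketch also assumes that, with three or more grouping columns, a "total" row is wanted at every level (per cyl, per cyl and gear, and so on), and that every grouping column except the first may be converted to character so it can hold the label "total".

```r
library(dplyr)

# Hypothetical helper, not from the original post.
count_with_subtotals <- function(data, grps, subtotalrows = TRUE) {
  # Turn the tidyselect expression (e.g. c(cyl, gear)) into plain column names.
  grp_names <- names(tidyselect::eval_select(rlang::enquo(grps), data))
  n_grps <- length(grp_names)

  # Grouping depths to compute: the full grouping first, then (optionally)
  # every shorter prefix of the grouping columns.
  depths <- if (subtotalrows) rev(seq_len(n_grps)) else n_grps

  pieces <- lapply(depths, function(k) {
    keep <- grp_names[seq_len(k)]
    out <- data %>%
      group_by(across(all_of(keep))) %>%
      summarise(Count = n(), .groups = "drop") %>%
      # Assumption: all grouping columns except the first become character
      # so that they can hold the label "total".
      mutate(across(any_of(grp_names[-1]), as.character))
    # Columns dropped at this depth are filled with "total".
    for (col in setdiff(grp_names, keep)) out[[col]] <- "total"
    out
  })

  bind_rows(pieces) %>%
    select(all_of(grp_names), Count) %>%
    arrange(across(all_of(grp_names)))
}

count_with_subtotals(mtcars, c(cyl, gear))
```

Because grp_names is an ordinary character vector, the first column is simply grp_names[1], and the same code handles any number of grouping columns without a hand-written loop per level. One caveat of this design: the converted columns sort as text, so values with different numbers of digits (e.g. "10" and "3") would not sort numerically.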