Thanks for looking at this!

I want a function to build tables showing stats (such as the mean) for specific variables, segregated into groups.

Below is the start of a function that works up to a point. My example uses the built-in mtcars data.

MeansbyGroup<-function(var){
  M1<-mtcars %>% group_by(cyl)
  n1=deparse(substitute(var))
  r1<-transpose(M1 %>% summarise(disp=mean(var)))[2,]
}


# EXAMPLE using mtcars

df=MeansbyGroup(mtcars$disp)
df[nrow(df) + 1,] =MeansbyGroup(mtcars$drat)
df

# The above will output
           V1         V2         V3
2   230.721875 230.721875 230.721875
2.1   3.596563   3.596563   3.596563

#which is not even the right means!

#below are the correct values...but I can't automate a table like I want
M1<-mtcars %>% group_by(cyl)
transpose(M1 %>% summarise(disp=mean(disp)))[2,]
transpose(M1 %>% summarise(disp=mean(drat)))[2,]

## Here is my desired output of means disaggregated into columns by the group "cyl"
## if the function worked right with the above example

           V1         V2         V3
disp   105.1364 183.3143 353.1
drat   4.070909 3.585714 3.229286

As you will see, in the function I have "n1=deparse(substitute(var))" to capture the variable name which I would like to have in the first column, instead of 2 and 2.1 as shown in the example output.

I've tried a few techniques, but when I try to add n1 to the vector, it destroys the values of the means!

Also, I'd like to make the function more generalizable. For this example, I'd prefer the function call to look like MeansbyGroup(var,group,dataframe), which in the above example would be called as MeansbyGroup(disp,cyl,mtcars).

Thanks!

DaveR7
  • In your function, you define `n1=deparse(substitute(var))`, but then you don't use `n1` ever, you continue using `var`. – Gregor Thomas Nov 08 '22 at 04:07
  • It seems a very strange function: (a) it does not take the data as input, `mtcars` is hard-coded inside the function; (b) it uses `deparse(substitute())`, which is usually used to convert a column name to a string, e.g., `disp` to `"disp"`; but (c) when you call it you don't give it an unquoted column name, you give it the extracted column `mtcars$disp`. Your note that you'd like to generalize it to `MeansbyGroup(var,group,dataframe)` is a much better idea, though I'd generally recommend having the data frame as the **first** argument so it is pipe-able. – Gregor Thomas Nov 08 '22 at 04:13
  • Have you looked at the R-FAQ we have on [Using variable names in functions with dplyr](https://stackoverflow.com/a/56830842/903061)? Or read the official [Vignette on programming with dplyr](https://dplyr.tidyverse.org/articles/programming.html)? `deparse(substitute())` can work, but recent versions of `dplyr` have much friendlier options. – Gregor Thomas Nov 08 '22 at 04:18

1 Answer

Here's how I would code your table outside of a function:

library(dplyr)
library(tibble)
mtcars %>% 
  group_by(cyl) %>%
  summarize(across(c(disp, drat), mean)) %>%
  column_to_rownames("cyl") %>%
  t
#               4          6          8
# disp 105.136364 183.314286 353.100000
# drat   4.070909   3.585714   3.229286

across is convenient when you might have several variables to summarize. To put this inside a function, we need deparse(substitute()) for the group, because column_to_rownames requires the column name as a string. For the other arguments we can use the friendlier {{ }} (curly-curly) syntax:

foo = function(data, group, vars) {
  grp_name = deparse(substitute(group))
  data %>% 
    group_by({{group}}) %>%
    summarize(across({{vars}}, mean)) %>%
    column_to_rownames(grp_name) %>%
    t
}

foo(data = mtcars, group = cyl, vars = c(disp, drat))
#               4          6          8
# disp 105.136364 183.314286 353.100000
# drat   4.070909   3.585714   3.229286
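
As for why the function in the question printed 230.72 in every column: the call passes mtcars$disp, which is the full 32-row column rather than a column name, so mean() sees every row regardless of the grouping. A minimal sketch of the difference (the object names wrong and right are just for illustration):

```r
library(dplyr)

# mtcars$disp is the whole column, so each group gets the overall mean
wrong <- mtcars %>%
  group_by(cyl) %>%
  summarise(m = mean(mtcars$disp))

# a bare column name is evaluated separately within each cyl group
right <- mtcars %>%
  group_by(cyl) %>%
  summarise(m = mean(disp))

wrong$m  # the same value repeated for all three groups
right$m  # one mean per cyl group
```

This is the same behavior that {{vars}} relies on inside foo: the column is looked up within the grouped data instead of being passed in as a pre-extracted vector.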
Gregor Thomas
  • I tried your function, but am running into an error with the group name "not found." For example, using another built in data set: foo(data = InsectSprays, group = spray, vars = c(count)) The error message is "Column `spray` is not found." – DaveR7 Nov 08 '22 at 14:49
  • Oops, I left `mtcars` in the function. Updated to `data` and it should work. – Gregor Thomas Nov 08 '22 at 15:29
  • Perfect! Though now I'm embarrassed that I didn't spot it!! Thanks! – DaveR7 Nov 08 '22 at 17:25