
There are a number of questions and answers about summarising multiple variables by one or more groups (e.g., Means multiple columns by multiple groups). I don't think this is a duplicate.

Here's what I'm trying to do: I want to calculate the mean for 4 variables by Displacement, then calculate the mean for those same four by Horsepower, and so on. I don't want to group by vs, am, gear, and carb simultaneously (i.e., I'm not simply looking for mydata %>% group_by(vs, am, gear, carb) %>% summarise_if(...)).

How can I calculate the means for a set of variables by Displacement, then calculate the means for that same set of variables by Horsepower, etc., then place in a table side by side?

I tried to come up with a reproducible example but couldn't. Here is a tibble from mtcars that shows what I'm ultimately looking for (data is made up):

tibble(Item = c("vs", "am" ,"gear", "carb"), 
   "Displacement (mean)"  = c(2.4, 1.4, 5.5, 1.3),
   "Horsepower (mean)" = c(155, 175, 300, 200))
Daniel
  • I don't understand what you're trying to do, so a reproducible example would help a lot. Let's take `mtcars`: calculating the mean of `mpg` by `cyl` gives you the three `mpg` averages for `cyl = 4`, `cyl = 6` and `cyl = 8`. Then you want to group by another variable? Perhaps `am`; this gives you two `mpg` averages for `am = 0` and `am = 1`. Then what? – Maurits Evers Aug 09 '18 at 21:57
  • So, for example, you want to get from the original `mtcars` dataset to the summary `tibble` you have provided? – Chris Aug 09 '18 at 22:04
  • @MauritsEvers, my apologies for the confusion. Dan Y (below) describes the ultimate outcome I'm trying to get to. – Daniel Aug 09 '18 at 22:25
  • After your edit there was still no mapping of the `mtcars` data to the `var1`, `var2` in your question text, but based on your edit I think I know what you meant. Please double check the changes in my edit. – Hack-R Aug 09 '18 at 22:48
  • @Hack-R, that's exactly what I was going for. Thank you for reading my mind :-) – Daniel Aug 09 '18 at 22:49

2 Answers


Perhaps something like this using purrr::map and some rlang syntax?

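library(dplyr)
library(purrr)

grps <- list("cyl", "vs")
# Summarise once per grouping variable, then stack the results row-wise;
# the list names become the "id" column.
map(setNames(grps, unlist(grps)), function(x)
    mtcars %>%
        group_by(!!rlang::sym(x)) %>%
        summarise(mean.mpg = mean(mpg), mean.disp = mean(disp)) %>%
        rename(id.val = 1)) %>%   # give the grouping column a common name
    bind_rows(.id = "id")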
## A tibble: 5 x 4
#  id    id.val mean.mpg mean.disp
#  <chr>  <dbl>    <dbl>     <dbl>
#1 cyl       4.     26.7      105.
#2 cyl       6.     19.7      183.
#3 cyl       8.     15.1      353.
#4 vs        0.     16.6      307.
#5 vs        1.     24.6      132.
Maurits Evers

With so few groupings, why not do each set of means one at a time:

out1 <- mydata %>% group_by(Var1) %>% 
    summarise(mean_1a = mean(var_a), mean_1b = mean(var_b))

out2 <- mydata %>% group_by(Var2) %>% 
    summarise(mean_2a = mean(var_a), mean_2b = mean(var_b))

out3 <- mydata %>% group_by(Var3) %>% 
    summarise(mean_3a = mean(var_a), mean_3b = mean(var_b))

If it makes sense to place the results side-by-side, you could do so with something like:

result <- cbind(out1, out2, out3)
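
As a concrete illustration of the placeholder code above, here is a minimal sketch using `mtcars`. The choice of `vs` and `am` as grouping variables and of `disp` and `hp` as summarised variables is an assumption made for the example, as are the output column names. Note that `cbind()` only pairs rows up by position, so it needs every summary to have the same number of rows; `vs` and `am` both have two levels, whereas adding a summary by `cyl` (three levels) would make `cbind()` fail.

```r
library(dplyr)

# Means of disp and hp by vs (two groups: 0 and 1)
out_vs <- mtcars %>%
    group_by(vs) %>%
    summarise(mean_disp_vs = mean(disp), mean_hp_vs = mean(hp))

# Means of the same variables by am (also two groups: 0 and 1)
out_am <- mtcars %>%
    group_by(am) %>%
    summarise(mean_disp_am = mean(disp), mean_hp_am = mean(hp))

# Both summaries have two rows, so they can sit side by side.
# Rows are matched by position only, not by any shared key.
result <- cbind(out_vs, out_am)
```

Because the rows are aligned purely by position, the first row holds the `vs = 0` means next to the `am = 0` means; the two halves of the table are otherwise unrelated.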
DanY
  • Yes, this is exactly the ultimate outcome I'm looking for in terms of placing them side by side. Is there a way to do this without creating separate objects? – Daniel Aug 09 '18 at 22:24
  • Nothing immediately comes to mind except writing one very long `cbind()`. Maybe just run `rm(out1, out2, out3)` to clean up the workspace after you `cbind`. – DanY Aug 09 '18 at 22:27
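
One possible way to avoid the intermediate objects, offered only as a sketch, is to map over the grouping variable names and bind the resulting summaries column-wise. It assumes dplyr 1.0 or later (for `across()`), and the grouping variables (`vs`, `am`), summarised variables (`disp`, `hp`) and output column names are illustrative choices rather than anything from the thread. As with `cbind()`, every grouping variable must produce the same number of groups.

```r
library(dplyr)
library(purrr)

grps <- c("vs", "am")  # grouping variables, each with two levels in mtcars

result <- map(grps, function(g) {
    mtcars %>%
        group_by(!!rlang::sym(g)) %>%
        # Suffix each mean with the grouping variable so names stay unique
        summarise(across(c(disp, hp), mean,
                         .names = paste0("mean_{.col}_by_", g)))
}) %>%
    bind_cols()
```

Here `result` should have one block of columns per grouping variable: the grouping column itself followed by its two means.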