
I want to use dplyr::summarise_all() and weighted.mean to calculate the weighted averages of many columns for each group.

I tried using an anonymous function directly, but it returned an error: 'x' and 'w' must have the same length. I know I can use summarise() with weighted.mean, but then I would need to spell out every column name, which is not what I want.

result = df%>%
  group_by(A)%>%
  summarise_all(function(x){weighted.mean(x, .$B)})

Here the data frame has a group column A, a weight column B, and other columns. For each group in A, I expect the weighted averages of the other columns' values, weighted by column B. I would like to do this with dplyr and weighted.mean, but I am open to other methods.

zqin

1 Answer


We don't need `.$`: inside the pipe, `.$B` extracts the whole column rather than only the values that belong to the current group, so its length no longer matches `x`.

df %>%
   group_by(A)%>%
   summarise_all(list(~ weighted.mean(., B)))

The call can also be written without a lambda function (~) by passing the weights explicitly as a named argument:

df %>%
   group_by(A)%>%
   summarise_all(weighted.mean, w = B)
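
To illustrate the length mismatch behind the original error, here is a minimal sketch. The toy data frame and the column name `C` are hypothetical, chosen only for illustration; the sketch assumes the magrittr pipe, where `.` stands for the whole data frame passed into `summarise()`.

```r
library(dplyr)

# Hypothetical toy data: group column A, weight column B, one value column C
df <- data.frame(
  A = c("a", "a", "b"),
  B = c(1, 3, 2),
  C = c(10, 20, 30)
)

# Inside a grouped summarise(), a bare column name is the current group's
# slice, while `.` is still the whole data frame handed to summarise().
lens <- df %>%
  group_by(A) %>%
  summarise(
    n_group = length(B),    # rows in the current group
    n_whole = length(.$B)   # rows in the entire data frame
  )

print(lens)
```

Whenever a group is smaller than the full data, `n_group` and `n_whole` differ, which is exactly the situation in which `weighted.mean(x, .$B)` complains that 'x' and 'w' must have the same length.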
akrun
  • It seems these answers no longer work. Thoughts? For example, this returns the "'x' and 'w' must have the same length" error the OP mentioned: mtcars %>% group_by(am) %>% summarise_all(list(~ weighted.mean(., hp))) – Rick Pack Jan 23 '23 at 16:36
  • @RickPack you may need `summarise(across(-B, ~ weighted.mean(.x, B)))` – akrun Jan 23 '23 at 17:07
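
As a sketch of the `across()` suggestion applied to the `mtcars` example from the comment above (with `am` as the group column and `hp` as the weight column): a likely reason the older form fails in recent dplyr versions is that `summarise_all()` also summarises the weight column itself, so columns processed afterwards see a weight of length one. Excluding the weight column with `-hp` avoids that. This is an illustration under those assumptions, not a definitive diagnosis.

```r
library(dplyr)

# Weighted mean of every column except the weight column hp, per level of am.
# across() skips the grouping column automatically; -hp keeps the weights
# from being overwritten by their own summary.
result <- mtcars %>%
  group_by(am) %>%
  summarise(across(-hp, ~ weighted.mean(.x, hp)))

print(result)

# Cross-check one cell by hand: weighted mean of mpg for am == 0
manual <- with(subset(mtcars, am == 0), weighted.mean(mpg, hp))
```

The result has one row per value of `am` and one column per remaining variable; `manual` should match the `mpg` entry in the `am == 0` row.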