Suppose I have a data frame:

set.seed(123)
dat<-data.frame(x=letters[1:9], 
                y=rep(LETTERS[1:3], each=3), 
                v1=rnorm(9,1,2),
                v2=rnorm(9,3,2),
                v3=rnorm(9,5,6))
dat
  x y         v1         v2         v3
1 a A -0.1209513  2.1086761  9.2081354
2 b A  0.5396450  5.4481636  2.1632516
3 c A  4.1174166  3.7196277 -1.4069422
4 d B  1.1410168  3.8015429  3.6921505
5 e B  1.2585755  3.2213654 -1.1560267
6 f B  4.4301300  1.8883177  0.6266526
7 g C  1.9218324  6.5738263  1.2497644
8 h C -1.5301225  3.9957010 -5.1201599
9 i C -0.3737057 -0.9332343 10.0267223

How can I calculate the mean of columns v1 to v3 for each group of y? The desired output looks like this:

  y       v1       v2       v3
1 A v1_meanA v2_meanA v3_meanA
2 B v1_meanB v2_meanB v3_meanB
3 C v1_meanC v2_meanC v3_meanC

I thought of using dplyr::group_by(y), but I'm not sure how to apply summarise() to multiple columns at once.

David Z
    Does this help? https://cran.r-project.org/web/packages/dplyr/vignettes/colwise.html –  Aug 04 '20 at 15:45

4 Answers

Try this:

library(dplyr)
set.seed(123)
dat<-data.frame(x=letters[1:9], 
  y=rep(LETTERS[1:3], each=3), 
  v1=rnorm(9,1,2),
  v2=rnorm(9,3,2),
  v3=rnorm(9,5,6))
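# Drop the non-numeric column x, then take the mean of every remaining column within each group of y
dat %>%
  select(-x) %>%
  group_by(y) %>%
  summarise_all(.funs = mean, na.rm = TRUE)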

# A tibble: 3 x 4
  y          v1    v2    v3
  <fct>   <dbl> <dbl> <dbl>
1 A     1.51     3.76  3.32
2 B     2.28     2.97  1.05
3 C     0.00600  3.21  2.05
Duck

The summarize_all() and summarize_at() functions were superseded in dplyr 1.0.0. According to vignette("colwise"), the preferred approach is now across():

library(dplyr)

dat %>% 
  group_by(y) %>% 
  summarize(across(v1:v3, mean))
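
A few variations on the across() call may be useful; this is a minimal sketch assuming dplyr 1.0.0 or later and the same dat as in the question. The result names (res_narm, res_num, res_named) and the "{.col}_mean" naming pattern are illustrative choices, not part of the original answers.

```r
library(dplyr)

# Same data as in the question
set.seed(123)
dat <- data.frame(x = letters[1:9],
                  y = rep(LETTERS[1:3], each = 3),
                  v1 = rnorm(9, 1, 2),
                  v2 = rnorm(9, 3, 2),
                  v3 = rnorm(9, 5, 6))

# Pass extra arguments such as na.rm through a lambda
res_narm <- dat %>%
  group_by(y) %>%
  summarise(across(v1:v3, ~ mean(.x, na.rm = TRUE)))

# Select every numeric column instead of naming a range;
# the non-numeric column x is skipped, so no select(-x) is needed
res_num <- dat %>%
  group_by(y) %>%
  summarise(across(where(is.numeric), mean))

# Control the output column names with .names (hypothetical pattern)
res_named <- dat %>%
  group_by(y) %>%
  summarise(across(v1:v3, mean, .names = "{.col}_mean"))
```

All three calls return one row per level of y; they differ only in how the columns are chosen and how the result columns are named.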
1

Use summarise_at() together with vars():

want <- dat %>%
  group_by(y) %>%
  summarise_at(vars(v1, v2, v3), mean, na.rm = TRUE)
Reeza
  • The vars() documentation indicated it was superseded so I just replaced it but didn't test. Thanks for the catch. – Reeza Aug 04 '20 at 16:29
1
Wanted <- dat %>%
  group_by(y) %>%
  summarise(mean1 = mean(v1), mean2 = mean(v2), mean3 = mean(v3))
BeccaLi