I have this data frame:
df <- data.frame(
  sex  = factor(c("boy", "boy", "girl", "girl")),
  age  = c(52L, 58L, 40L, 62L),
  b1   = c(25L, 23L, 30L, 26L),
  b2   = c(187L, 220L, 190L, 204L),
  b100 = c(180L, 120L, 155L, 124L)
)
I want to group_by sex and then apply different functions in summarise() to different columns:
calculate the "mean" of the column "age" (only)
calculate the "sd" of all columns whose names begin with "b": b1, b2, ...
I tried:
df %>% group_by(sex) %>% summarise_at(.vars = c("age", names(df)[substr(names(df), 1, 1) == "b"]),
                                      .funs = c(mean = "mean", sd = "sd"))
but it applies both the "mean" and "sd" functions to all of the selected columns, which is exactly what I want to avoid.
The result I want has one column mean_age, plus the other columns sd_b1, sd_b2, ...
Is that possible with dplyr, or must I do it in two steps, like this:
df %>% group_by(sex) %>% summarise(mean_age = mean(age))
df %>% group_by(sex) %>% summarise_at(.vars = c(names(df)[substr(names(df), 1, 1) == "b"]),
                                      .funs = c(sd = "sd"))
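For reference, the two-step version can at least be glued into the single table I am after by joining the two summaries on sex. This is only a rough sketch of that workaround, assuming dplyr is loaded; the names mean_part, sd_part and result are placeholders I made up, and the sd_ prefix is added by hand in base R:

```r
library(dplyr)

# Same example data as above
df <- data.frame(
  sex  = factor(c("boy", "boy", "girl", "girl")),
  age  = c(52L, 58L, 40L, 62L),
  b1   = c(25L, 23L, 30L, 26L),
  b2   = c(187L, 220L, 190L, 204L),
  b100 = c(180L, 120L, 155L, 124L)
)

# Step 1: mean of age only, per sex
mean_part <- df %>%
  group_by(sex) %>%
  summarise(mean_age = mean(age))

# Step 2: sd of every column whose name starts with "b", per sex
sd_part <- df %>%
  group_by(sex) %>%
  summarise_at(vars(starts_with("b")), sd)

# Prefix the summarised columns (everything except the grouping column sex)
names(sd_part)[-1] <- paste0("sd_", names(sd_part)[-1])

# Combine both summaries into one row per sex
result <- inner_join(mean_part, sd_part, by = "sex")
result
```

This gives the columns sex, mean_age, sd_b1, sd_b2 and sd_b100, but it still needs two summaries and a join, which is what I would like to avoid.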
Thank you.