group_by and summarise with different functions applied to several columns

Question

If I have this dataframe:

(df <- data.frame(
  sex  = factor(c("boy", "boy", "girl", "girl")),
  age  = c(52L, 58L, 40L, 62L),
  b1   = c(25L, 23L, 30L, 26L),
  b2   = c(187L, 220L, 190L, 204L),
  b100 = c(180L, 120L, 155L, 124L)
))

I want to group_by sex and then apply different functions in summarise() to different columns:

calculate the "mean" of the column "age" (only)

calculate the "sd" of all columns whose names begin with "b": columns b1, b2, ...

I tried:

df%>%group_by(sex)%>%summarise_at(.vars = c("age",names(df)[substr(names(df),1,1)=="b"]),
                                            .funs = c(mean="mean", sd="sd"))

but it applies both the "mean" and "sd" functions to all of the selected columns, which is exactly what I want to avoid.

The result I want is one column mean_age plus further columns sd_b1, sd_b2, ...

Is that possible with dplyr, or must I do it in two steps, like this:

df%>%group_by(sex)%>%summarise(mean_age=mean(age))

df%>%group_by(sex)%>%summarise_at(.vars = c(names(df)[substr(names(df),1,1)=="b"]),
                                            .funs = c(sd="sd"))

Thank you.

Please edit your question and fix your code, as it is producing errors. It might help to also include your expected output (and how this output is incorrect). — r2evans, Jun 03 '19 at 16:27
@r2evans I modified my question by deleting the code I had tried, because it made no sense. Maybe it's less confusing now. — DD chen, Jun 03 '19 at 16:32
But now it's unclear what the difficulty is: it seems like a textbook case of `group_by` + `summarize_at`. — divibisan, Jun 03 '19 at 16:34
Is the problem that you're not selecting the variables to summarize correctly? The whole point of using these scoped dplyr functions is that you can use `?select_helpers` to select variables. So just: `vars(starts_with('b'))`. Take a look at `?summarize_at` and `?select_helpers` — divibisan, Jun 03 '19 at 16:43
@divibisan Thank you. So in summarize_at(), the action (sum or mean) is applied to every selected column. I can't, for example, apply sum to columns 1-10 and then mean to columns 11-20? — DD chen, Jun 04 '19 at 08:40
Sure, just use different `summarize_at` statements for each distinct group of variables — divibisan, Jun 04 '19 at 14:42
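
The suggestions in the comments above can be sketched as follows. This is only an illustration, not a confirmed answer from the thread. Option 1 assumes dplyr 1.0.0 or later, where across() can be used inside summarise(); it is not something the thread itself mentions. Option 2 follows the "separate summarize_at statements" advice and joins the two results on sex. The names res_across and res_joined are made up for this example.

```r
library(dplyr)

# Same data as in the question, built directly with data.frame()
df <- data.frame(
  sex  = factor(c("boy", "boy", "girl", "girl")),
  age  = c(52L, 58L, 40L, 62L),
  b1   = c(25L, 23L, 30L, 26L),
  b2   = c(187L, 220L, 190L, 204L),
  b100 = c(180L, 120L, 155L, 124L)
)

# Option 1 (assumption: dplyr >= 1.0.0): a single summarise() call that
# mixes a plain summary of age with across() over the "b" columns
res_across <- df %>%
  group_by(sex) %>%
  summarise(
    mean_age = mean(age),
    across(starts_with("b"), sd, .names = "sd_{.col}")
  )

# Option 2: one summary per group of columns, then join on the grouping column
res_joined <- inner_join(
  df %>% group_by(sex) %>% summarise(mean_age = mean(age)),
  df %>% group_by(sex) %>% summarise_at(vars(starts_with("b")), list(sd = sd)),
  by = "sex"
)

print(res_across)
print(res_joined)
```

Option 1 names the columns mean_age, sd_b1, sd_b2 and sd_b100, which matches the layout asked for in the question. Option 2 uses the default summarise_at naming, so the sd columns come out as b1_sd, b2_sd and b100_sd and would need renaming to match.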
