I would like to compute summary variables from multiple columns in a data frame. This is possible when typing out all the column names, but I would like to use starts_with() and similar functions. For example:
df <- data.frame(A1 = rnorm(100, 0, 1),
                 A2 = rnorm(100, 0, 1),
                 A3 = rnorm(100, 0, 1),
                 B1 = rnorm(100, 0, 1),
                 B2 = rnorm(100, 0, 1))
What works:
library(tidyverse)
df %>% mutate(A = (A1 + A2 + A3)/3)
df %>% mutate(A = rowMeans(select(., A1:A3)))
However, the former becomes tedious when summarising many variables, while the latter becomes very slow as the number of rows grows. I suspect there must be a faster solution.
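For reference, the rowMeans() form does seem to accept starts_with() inside select(), so it at least avoids spelling out each column, though it does not address the speed concern. A minimal sketch, assuming dplyr is installed; the seed and the name res are only there for illustration:

```r
library(dplyr)

set.seed(1)  # hypothetical seed, only to make the example reproducible
df <- data.frame(A1 = rnorm(100, 0, 1),
                 A2 = rnorm(100, 0, 1),
                 A3 = rnorm(100, 0, 1),
                 B1 = rnorm(100, 0, 1),
                 B2 = rnorm(100, 0, 1))

# Select columns by prefix, then take the mean across each row.
# The `.` placeholder refers to the data frame piped in by %>%.
res <- df %>%
  mutate(A = rowMeans(select(., starts_with("A"))),
         B = rowMeans(select(., starts_with("B"))))

head(res)
```

Note that `.` is the original df in both expressions, so the new column A is not itself picked up by starts_with("A") here; it would be if the same code were rerun on res.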
What does not work:
df %>% mutate(A = mean(A1:A3))
df %>% group_by(row_number()) %>% mutate(A = mean(A1:A3))
df %>% group_by(row_number()) %>% mutate(A = mean(starts_with("A")))
So my question is: Is there a way to use mean() etc. within mutate() to compute row means, ideally without having to spell out every single variable?