This seems to answer your question:
DATA:
df <- data.frame(
  race = sample(c("black", "other", "white"), 100, replace = TRUE),
  gender = sample(c("f", "m"), 100, replace = TRUE),
  x = rnorm(100))
EDIT:
The most elegant and quickest way to achieve your goal is to compute the means with tapply rather than aggregate:
tapply(df$x, list(df$race, df$gender), mean)
The output is pretty much what you want:
                f          m
black  0.13584749  0.1928834
other -0.37386003  0.2078025
white -0.09913409 -0.1672589
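Note that tapply returns a matrix here, not a data frame. If a data frame is needed downstream, a minimal sketch of the conversion could look like this. The seed and the names m and wide are assumptions added for illustration, so the numbers will differ from the output shown above:

```r
# Assumption: a fixed seed, only so the example is reproducible
set.seed(1)
df <- data.frame(
  race = sample(c("black", "other", "white"), 100, replace = TRUE),
  gender = sample(c("f", "m"), 100, replace = TRUE),
  x = rnorm(100))

# Rows are the levels of race, columns the levels of gender
m <- with(df, tapply(x, list(race, gender), mean))

# Single cells can be indexed by name
m["black", "f"]

# Convert the matrix to a data frame; the races become row names
wide <- as.data.frame(m)
wide
```

The matrix form is convenient for lookups by name, while the data frame form is easier to merge with other data.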
Other solution:
This computes the means by gender and race using aggregate:
df1 <- aggregate(
df$x,
list(df$race, df$gender),
FUN = mean, na.rm = TRUE
)
df1
  Group.1 Group.2           x
1   black       f  0.13584749
2   other       f -0.37386003
3   white       f -0.09913409
4   black       m  0.19288337
5   other       m  0.20780250
6   white       m -0.16725892
To get more meaningful column names, rename the columns:
colnames(df1) <- c("race", "gender", "x")
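As an aside, the renaming step can probably be avoided by using the formula interface of aggregate, which keeps the original column names in the result. This is a sketch rather than a drop-in replacement: the seed and the name df2 are assumptions for illustration, and the formula interface drops rows with missing values by default instead of relying on na.rm:

```r
# Assumption: a fixed seed, only so the example is reproducible
set.seed(1)
df <- data.frame(
  race = sample(c("black", "other", "white"), 100, replace = TRUE),
  gender = sample(c("f", "m"), 100, replace = TRUE),
  x = rnorm(100))

# Mean of x for every combination of race and gender;
# the result columns are named race, gender and x
df2 <- aggregate(x ~ race + gender, data = df, FUN = mean)
df2
```

The result has the same long layout as df1, so the reshape step applies to it unchanged.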
Change df1 to wide format using reshape:
reshape(df1, idvar = "race", timevar = "gender", direction = "wide")
   race         x.f        x.m
1 black  0.13584749  0.1928834
2 other -0.37386003  0.2078025
3 white -0.09913409 -0.1672589