I am learning R, and I promise you I have searched high and low for an answer to this. It is so simple, but for some reason I cannot figure it out for the life of me!

I have a dataframe containing one numeric vector and two factors:

```r
team.weight <- c(150, 160, 120, 100) # player's weight
team.jersey <- factor(c("blue", "green", "blue", "blue")) # player's jersey color
team.sex <- factor(c("male", "female", "female", "male")) # player's sex
team <- data.frame(team.jersey, team.sex, team.weight)
```

I want to display a table (I forget what it is called) that shows the average weight of the players, that is, mean(team.weight), for each combination of levels of the two factors.

I can do this manually, but there has to be a better way!

mean(team.weight[c(team.jersey[1],team.sex[1])])
mean(team.weight[c(team.jersey[1],team.sex[2])])
mean(team.weight[c(team.jersey[1],team.sex[3])])
mean(team.weight[c(team.jersey[1],team.sex[4])])

mean(team.weight[c(team.jersey[2],team.sex[1])])
mean(team.weight[c(team.jersey[2],team.sex[2])])
mean(team.weight[c(team.jersey[2],team.sex[3])])
mean(team.weight[c(team.jersey[2],team.sex[4])])

mean(team.weight[c(team.jersey[3],team.sex[1])])
mean(team.weight[c(team.jersey[3],team.sex[2])])
mean(team.weight[c(team.jersey[3],team.sex[3])])
mean(team.weight[c(team.jersey[3],team.sex[4])])

mean(team.weight[c(team.jersey[4],team.sex[1])])
mean(team.weight[c(team.jersey[4],team.sex[2])])
mean(team.weight[c(team.jersey[4],team.sex[3])])
mean(team.weight[c(team.jersey[4],team.sex[4])])

Any help would be greatly appreciated. I know the answer is dumb, but I cannot understand what it is.

  • Your manual approach doesn't make sense to me. Maybe you want `aggregate(team.weight ~ team.jersey + team.sex, data=team, FUN=mean)`? – Roland Aug 25 '13 at 14:23
  • Sorry for the confusion. What I am trying to answer is this: What is the average weight for each: blue/male, blue/female, green/male, and green/female? – Ludger Lassen Aug 25 '13 at 14:32
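
For reference, the `aggregate()` call suggested in the comment above can be run against the question's data roughly as follows. This is a minimal sketch that rebuilds the `team` data frame from the question; the variable name `avg` is only illustrative and does not appear in the original post.

```r
# Rebuild the data frame from the question
team.weight <- c(150, 160, 120, 100)
team.jersey <- factor(c("blue", "green", "blue", "blue"))
team.sex    <- factor(c("male", "female", "female", "male"))
team <- data.frame(team.jersey, team.sex, team.weight)

# Mean weight for every jersey/sex combination that occurs in the data
# (avg is an illustrative name, not from the original post)
avg <- aggregate(team.weight ~ team.jersey + team.sex, data = team, FUN = mean)
print(avg)
```

The result is a long-format data frame with one row per observed combination, so green/male, which has no players, does not get a row.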

2 Answers

```r
tapply(team.weight, list(team$team.jersey, team$team.sex), mean)
#       female male
# blue     120  125
# green    160   NA
```
Julius Vainora
  • Good one! Is there a more basic approach to this? – Ferdinand.kraft Aug 25 '13 at 14:32
  • This looks perfect! So, just for my own edification, what that code is saying in plain English is: apply mean to team.weight, but break up doing so by BOTH team.jersey and team.sex? – Ludger Lassen Aug 25 '13 at 14:40
  • @LudgerLassen, exactly, see `?tapply` for some more examples and [this](http://stackoverflow.com/questions/3505701/r-grouping-functions-sapply-vs-lapply-vs-apply-vs-tapply-vs-by-vs-aggrega) question for some useful information. @Ferdinand.kraft, I couldn't come up with a more basic one, but this one uses base R and I believe is quite concise too. – Julius Vainora Aug 25 '13 at 14:45
  • @LudgerLassen, note that in this case you could use just `list(team.jersey, team.sex)` instead of `list(team$team.jersey, team$team.sex)` to make it even shorter, since you already have those vectors. – Julius Vainora Aug 25 '13 at 14:46
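
As a small illustration of the shorter form mentioned in the last comment, here is a sketch that uses the free-standing vectors and also names the grouping factors so the two dimensions of the result are labelled. The names `jersey`, `sex`, and `res` are illustrative assumptions, not part of the original answer.

```r
# Data from the question
team.weight <- c(150, 160, 120, 100)
team.jersey <- factor(c("blue", "green", "blue", "blue"))
team.sex    <- factor(c("male", "female", "female", "male"))

# Apply mean() to team.weight, split by both factors at once.
# Naming the list elements (illustrative names) labels the dimensions.
res <- tapply(team.weight, list(jersey = team.jersey, sex = team.sex), mean)
print(res)

# Individual cells can be looked up by level name
res["blue", "male"]    # mean of 150 and 100
res["green", "male"]   # NA, because no player has this combination
```

Because the result is an ordinary matrix, a combination with no players stays visible as `NA` instead of being dropped.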

Here is a plyr example:

```r
library(plyr)
ddply(team, .(team.jersey, team.sex), summarize, avgWeight = mean(team.weight))
#   team.jersey team.sex avgWeight
# 1        blue   female       120
# 2        blue     male       125
# 3       green   female       160
```
David
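
The `ddply` output above lists only the combinations that actually occur. If a long-format result that also keeps the empty green/male cell is wanted, one base R possibility is to reshape the `tapply` result. This is only a sketch under that assumption; the names `jersey`, `sex`, `wide`, and `long` are illustrative and not from either answer.

```r
# Data from the question
team.weight <- c(150, 160, 120, 100)
team.jersey <- factor(c("blue", "green", "blue", "blue"))
team.sex    <- factor(c("male", "female", "female", "male"))

# Cross-tabulated means (a matrix with NA for the empty combination)
wide <- tapply(team.weight, list(jersey = team.jersey, sex = team.sex), mean)

# Reshape to one row per combination; responseName sets the value column name
long <- as.data.frame(as.table(wide), responseName = "avgWeight")
print(long)
```

Here `long` has four rows, one per jersey/sex pair, with `NA` in `avgWeight` for green/male.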