Say I have an "integer" factor vector of length 5:

vecFactor = c(1,3,2,2,3)

and a numeric data vector of length 5:

vecData = c(1.3,4.5,6.7,3,2)

How can I find the average of the data for each factor level, so that I get this result:

Factor 1: Average = 1.3
Factor 2: Average = 4.85
Factor 3: Average = 3.25

3 Answers

 tapply(vecData, vecFactor, FUN=mean)
   1    2    3 
 1.30 4.85 3.25 
akrun

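Instead of tapply, I sometimes use a linear model, which is quite flexible (for instance, if you need to add weights). Don't forget the "-1" in the formula: it drops the intercept, so each coefficient is the group mean itself rather than a difference from the first group.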

lm(vecData~factor(vecFactor)-1)$coef

factor(vecFactor)1 factor(vecFactor)2 factor(vecFactor)3
        1.30               4.85               3.25
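
As a rough sketch of the weights case mentioned above: the vector `w` below is made up purely for illustration (it is not part of the question), and the comparison against `weighted.mean` is only there as a sanity check. The last lines show what the coefficients look like if the "-1" is left out.

```r
vecFactor <- c(1, 3, 2, 2, 3)
vecData   <- c(1.3, 4.5, 6.7, 3, 2)
w         <- c(1, 2, 1, 3, 1)  # hypothetical weights, one per observation

# Weighted group means via lm: no intercept, one coefficient per level
fit <- lm(vecData ~ factor(vecFactor) - 1, weights = w)
weightedLm <- as.numeric(coef(fit))

# Same quantity computed directly, group by group
idx <- split(seq_along(vecData), vecFactor)
weightedDirect <- as.numeric(sapply(idx, function(i) weighted.mean(vecData[i], w[i])))

# With the intercept kept, the first coefficient is the mean of level 1
# and the others are differences from it, not group means
withIntercept <- as.numeric(coef(lm(vecData ~ factor(vecFactor))))
```

Here `weightedLm` and `weightedDirect` should agree, while `withIntercept` holds the level-1 mean followed by the offsets of levels 2 and 3.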
agenis
  • (which also has the advantage of avoiding the "what's wrong? oh wait, forgot the na.rm=T" part) – agenis Oct 29 '14 at 15:52
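
To make the na.rm point concrete, here is a small sketch using a variant of the data with one value replaced by NA (`vecDataNA` is an illustrative name, not from the question). It assumes R's default setting `options(na.action = "na.omit")`, under which lm silently drops incomplete rows.

```r
vecFactor <- c(1, 3, 2, 2, 3)
vecDataNA <- c(1.3, 4.5, NA, 3, 2)  # third value made missing for illustration

# Without na.rm, any group containing an NA gets an NA mean
plain <- tapply(vecDataNA, vecFactor, mean)

# Extra arguments to tapply are passed on to mean()
fixed <- tapply(vecDataNA, vecFactor, mean, na.rm = TRUE)

# lm drops the incomplete row itself (assuming the default na.action)
viaLm <- coef(lm(vecDataNA ~ factor(vecFactor) - 1))
```

In this sketch `plain` has NA for level 2, while `fixed` and `viaLm` both give the mean of the remaining observations in each level.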

To get a tidy table, try the aggregate function with a data.frame:

ddf = data.frame(vecData, vecFactor)
aggregate(vecData~vecFactor, data=ddf, mean)
  vecFactor vecData
1         1    1.30
2         2    4.85
3         3    3.25

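data.table can also be used for this (with by, groups appear in order of first occurrence rather than sorted; use keyby for sorted output):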

library(data.table)    
ddt = data.table(ddf)
ddt[,list(meanval=mean(vecData)),by=vecFactor]
   vecFactor meanval
1:         1    1.30
2:         3    3.25
3:         2    4.85
rnso