I have the following data.frame:

> test
  a b  c
1 1 4 10
2 1 5 11
3 2 6 12
4 2 7 14
5 2 8 15
6 8 9 15

I'd like to write a for loop that calculates the mean of column b for each distinct value in column a. I'd therefore like the following output:

> average
    1   2   8
[1] 4.5 7.0 9.0

My attempt so far:

subset<-data.frame()
average<-vector(mode="numeric")
for (i in 1:length(test$a)) {
  subset<-subset(test,test$a==test$a[i])
  average[i]<-mean(subset$b)
}

However, I get the following result:

> average
[1] 4.5 4.5 7.0 7.0 7.0 9.0

This should be fairly easy, but unfortunately I can't get it to work.

Could you please help me out?

Thank you very much in advance.

panajach

2 Answers

You could try this with data.table:

library(data.table)
setDT(test)
test[, mean(b), by = a]
   a  V1
1: 1 4.5
2: 2 7.0
3: 8 9.0
erasmortg

One line in base R...

tapply(test$b,test$a,mean)

  1   2   8 
4.5 7.0 9.0

By the way, your code does not work because you are looping over each element of test$a, even duplicated values, rather than just over elements of unique(test$a).
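
If you do want to keep the for loop, the point above can be sketched roughly as follows. This is only a minimal sketch: it rebuilds the data frame from the question, and the names `groups` and `average` are just illustrative choices, not anything required by R.

```r
# Data from the question
test <- data.frame(a = c(1, 1, 2, 2, 2, 8),
                   b = 4:9,
                   c = c(10, 11, 12, 14, 15, 15))

# Loop over the distinct values of a, not over every row
groups <- unique(test$a)

# Preallocate one slot per group and label each slot with its group value
average <- numeric(length(groups))
names(average) <- groups

for (i in seq_along(groups)) {
  # Mean of the b values whose a equals the current group
  average[i] <- mean(test$b[test$a == groups[i]])
}

average
```

Under these assumptions the result has one entry per distinct value of a, which should match the tapply() output above.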

Andrew Gustar