I have what I fear may be a simple problem, to which I almost have the solution (indeed, I do have a solution, but it's clumsy).

I have a data frame as follows:

name  replicate  value
A     1          0.9
A     2          1
B     1          0.8
B     2          0.81
C     1          0.7
C     2          0.9

What I would like to do is compute the mean of 'value', by 'name', and append the results to a new column as follows:

name  replicate  value  meanbyname
A     1          0.9    0.95
A     2          1      0.95
B     1          0.8    0.805
B     2          0.81   0.805
C     1          0.7    0.8
C     2          0.9    0.8

I can calculate the means in any of the following ways:

a<-aggregate(value~name, data=test, FUN=function(x) c(mean=mean(x),count=length(x)))
b<-aggregate(test$value~test$name, FUN=mean)
c<-tapply(test$value, test$name, mean)

but I cannot append them easily to the data frame as they are the wrong length.

I could then do this:

 test$meanbyname<-rep(c, each=2)

This seems close, but gives an error, as object 'a' seems to be only two columns wide:

  test$meanbyname<-rep(a$value.mean, each=a$value.count)
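For context, a likely reason the line above fails: when FUN returns a vector, aggregate stores the results in a single matrix column named value, so a$value.mean does not exist as a column of its own. The sketch below uses a made-up variant of the data (three replicates of A, one of B) and assumes the rows are already sorted by name; it is an illustration of the rep idea, not a recommended solution.

```r
# Hypothetical data: unequal numbers of replicates, sorted by name
test <- data.frame(
  name      = c("A", "A", "A", "B", "C", "C"),
  replicate = c(1, 2, 3, 1, 1, 2),
  value     = c(0.9, 1, 0.8, 0.8, 0.7, 0.9)
)

a <- aggregate(value ~ name, data = test,
               FUN = function(x) c(mean = mean(x), count = length(x)))

# 'a' has two columns, "name" and "value"; "value" is a matrix
# whose own columns are "mean" and "count"
group_mean  <- a$value[, "mean"]
group_count <- a$value[, "count"]

# times= repeats each mean by its own group size (each= takes a single number).
# This only lines up if 'test' is sorted by name, like the aggregate output.
test$meanbyname <- rep(group_mean, times = group_count)
```

Because this depends on row order, it breaks as soon as the data frame is not sorted by name.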

I'd like a way of automating the process so it will work if there are, for example, three replicates of name=A and only one of name=B. Could there be a one line solution which is more generalisable?

Thank you all in advance for your help.

looble

1 Answer

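You could use ave from base R, which returns a vector the same length as its input, so it can be assigned directly as a new column: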

 test$meanbyname <- with(test, ave(value, name))
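As a quick check that this generalises, here is a minimal sketch on made-up data with unequal group sizes and unsorted rows (the data and the extra n_in_group column are illustrative assumptions, not part of the question). ave uses mean by default; any other function has to be passed by name as FUN, because that argument comes after `...` in ave's signature.

```r
# Hypothetical data: three replicates of A, one of B, two of C, rows shuffled
test <- data.frame(
  name  = c("B", "A", "C", "A", "C", "A"),
  value = c(0.8, 0.9, 0.7, 1, 0.9, 0.8)
)

# Group mean, returned in the original row order
test$meanbyname <- with(test, ave(value, name))

# Group size, using the same mechanism with a different function
test$n_in_group <- with(test, ave(value, name, FUN = length))

print(test)
```

No sorting or manual repetition is needed, since each row receives the summary of its own group.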

Alternatively, mutate from dplyr or := from data.table gives the same result. With dplyr:

 library(dplyr)
 # Assign the result back so the new column is kept
 test <- test %>%
   group_by(name) %>%
   mutate(meanbyname = mean(value)) %>%
   ungroup()

Or with data.table, which adds the column by reference:

 library(data.table)
 setDT(test)[, meanbyname:= mean(value), by=name]
akrun