Mean of triplicate

Question

I've just cleaned up a data frame that I scraped from an Excel spreadsheet by, amongst other things, removing percentage signs from some of the numbers (see Removing Percentages from a Data Frame).

The data has twenty-four rows representing the parameters and results from eight experiments done in triplicate, e.g. what one would get from,

DF1 <- data.frame(X = 1:24, Y = 2 * (1:24), Z = 3 * (1:24))

I want to find the mean of each triplicate (which, fortunately, are in sequential order) and create a new data frame with eight rows and the same number of columns.

I tried to do this using,

DF2 <- data.frame(replicate(3,sapply(DF1, mean)))

which gave me the mean of each column as rows, repeated three times. I wanted to get a data frame that would give me,

data.frame(X = c(2,5,8,11,14,17,20,23), Y = c(4,10,16,22,28,34,40,46), Z = c(6,15,24,33,42,51,60,69))

which I worked out by hand; it's supposed to be the reduced result.

Thanks, ...

Any help would be gratefully received.

have you looked at this? http://stackoverflow.com/questions/10945703/r-calculate-row-means-on-specific-columns — Rachel Gallen, Jan 18 '13 at 13:08
thanks for the link @Rachel, it is close to but not quite what I needed. — DarrenRhodes, Jan 18 '13 at 13:21

Tomas · Answer 1 · 2013-01-18T13:40:42.730

4

Nice task for codegolf!

aggregate(DF1, list(rep(1:8, each=3)), mean)[,-1]

To be more general, replace 8 with nrow(DF1)/3, the number of triplicates, so that the grouping vector has exactly one entry per row.
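A minimal sketch of a generalised version of the aggregate() call, assuming every group has the same number of replicates and the rows are in sequential order. The names k, n_groups and grp are illustrative and not part of the original answer. The grouping vector needs one entry per row, so the number of groups is the row count divided by the replicate count.

```r
# Example data from the question: 24 rows, i.e. 8 experiments in triplicate.
DF1 <- data.frame(X = 1:24, Y = 2 * (1:24), Z = 3 * (1:24))

k <- 3                                   # replicates per experiment (assumed constant)
stopifnot(nrow(DF1) %% k == 0)           # guard against an incomplete final group
n_groups <- nrow(DF1) / k                # 8 groups for the example data
grp <- rep(seq_len(n_groups), each = k)  # 1 1 1 2 2 2 ... 8 8 8

# aggregate() prepends the grouping column (Group.1); [, -1] drops it again.
DF2 <- aggregate(DF1, by = list(grp), FUN = mean)[, -1]
DF2
```

Passing a grouping vector whose length differs from nrow(DF1) makes aggregate() stop with an error, which is why the group count rather than the row count goes inside rep().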

... or, my favorite, using matrix multiplication:

t(t(DF1) %*% diag(8)[rep(1:8,each=3),]/3)
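The one-liner above can be unpacked step by step. This is only a sketch of how it appears to work; the intermediate names G, group_sums and group_means are illustrative and not from the original answer.

```r
DF1 <- data.frame(X = 1:24, Y = 2 * (1:24), Z = 3 * (1:24))

# Indicator matrix: 24 rows (one per observation), 8 columns (one per group).
# Row i holds a single 1 in the column of the group that row i belongs to.
G <- diag(8)[rep(1:8, each = 3), ]

# t(DF1) is 3 x 24, so t(DF1) %*% G is the 3 x 8 matrix of per-group sums.
group_sums <- t(DF1) %*% G

# Dividing by the group size (3) turns sums into means; t() puts the
# groups back in rows and the variables X, Y, Z back in columns.
group_means <- t(group_sums / 3)
group_means
```

Because %*% binds more tightly than /, the original expression divides the whole product by 3, which is what the sketch does explicitly.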

edited Jan 18 '13 at 13:40

answered Jan 18 '13 at 13:18


Thanks for both of your answers. Does the first one return a data.frame and the second one return a matrix? I ask because the way that R returns the results looks slightly different. – DarrenRhodes Jan 18 '13 at 14:10
@user1945827, exactly. You can typecast them using `as.matrix` or `as.data.frame`. – Tomas Jan 18 '13 at 14:13
In the first answer, when I replace '8' with nrow(DF1) I get an error. I don't know why, but thought you might like to know. – DarrenRhodes Jan 18 '13 at 14:28
@user1945827 - works for me. What error do you get? Is the DF1 still the original `data.frame` as you defined it in your question? – Tomas Jan 18 '13 at 14:33
" Is the DF1 still the original...". Ah, no. The other pieces of code provided by yourself and others are more than sufficient. I'll regenerate the error and post back here tomorrow. – DarrenRhodes Jan 18 '13 at 16:46
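To illustrate the point about return types raised in the comments, here is a small sketch of the two results and the conversions between them. The names grp, res_df, res_mat, mat_as_df and df_as_mat are illustrative only.

```r
DF1 <- data.frame(X = 1:24, Y = 2 * (1:24), Z = 3 * (1:24))
grp <- rep(1:8, each = 3)

res_df  <- aggregate(DF1, list(grp), mean)[, -1]  # a data.frame
res_mat <- t(t(DF1) %*% diag(8)[grp, ] / 3)        # a numeric matrix

is.data.frame(res_df)  # TRUE
is.matrix(res_mat)     # TRUE

# Convert in either direction; the column names X, Y, Z are kept.
mat_as_df <- as.data.frame(res_mat)
df_as_mat <- as.matrix(res_df)
```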

Stephan Kolassa · Accepted Answer · 2013-01-18T13:26:19.707

1

This works:

foo <- matrix(unlist(by(data=DF1,INDICES=rep(1:8,each=3),FUN=colMeans)),
  nrow=8,byrow=TRUE)
colnames(foo) <- colnames(DF1)

Look at ?by.
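As a possible variation on the by() approach, the per-group results can be row-bound instead of unlisted, which keeps the column names without a separate colnames step. This is a sketch under the same assumptions as above (8 groups of 3 sequential rows); pieces and foo_df are illustrative names.

```r
DF1 <- data.frame(X = 1:24, Y = 2 * (1:24), Z = 3 * (1:24))
grp <- rep(1:8, each = 3)

# by() splits DF1 by grp and applies colMeans() to each 3-row piece,
# giving a list-like object with one named numeric vector per group.
pieces <- by(data = DF1, INDICES = grp, FUN = colMeans)

# Row-bind the per-group vectors into an 8 x 3 matrix; the names
# X, Y, Z of each vector become the column names.
foo <- do.call(rbind, pieces)

# Convert to a data frame if that is the shape needed downstream.
foo_df <- as.data.frame(foo)
foo_df
```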

edited Jan 18 '13 at 13:26

answered Jan 18 '13 at 13:06


Hi @Stephan, your code almost works. I've lost my column headings. I tried the script again using 'data.frame' instead of 'matrix' but this returned a mess. I'll stick to your script and use 'names' to put the headers back if nothing else turns up. Thanks, – DarrenRhodes Jan 18 '13 at 13:20
I edited the code to add the `colnames`. But @Tomas' solution is much prettier, anyway, so +1 to him. – Stephan Kolassa Jan 18 '13 at 13:27
