I am relatively new to R and have a complicated problem to solve. I have loaded a list of over 1000 data frames into R and called this list x. I want to pick certain data frames, compute the mean and variance over each entire data frame (excluding its first column), and save the results in two separate vectors. For example, I want the mean and variance of every third data frame in the list, starting at element 3 and ending at element 54.

So what I ultimately want are two vectors:

meanvector = c(mean(data frame 3), mean(data frame 6), ..., mean(data frame 54))
variancevector = c(var(data frame 3), var(data frame 6), ..., var(data frame 54))

This problem is well above my knowledge level. I think I can do this with some sort of loop, but I do not know how to write one. Any help would be much appreciated! Thank you in advance.

juba
user1836894
  • When you say `entire data.frame excluding the first column`, do you mean `means of all other columns separately` or `one mean value of entire data.frame with the first column removed`? – Arun Feb 25 '13 at 21:13
  • I meant one mean value of entire data.frame with the first column removed. – user1836894 Feb 25 '13 at 21:33
  • Good, then I've understood it correctly. My answer should work. – Arun Feb 25 '13 at 21:34

2 Answers

You can use lapply and pass indices as follows:

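ids <- seq(3, 54, by = 3)  # every third element, 3 through 54
out <- do.call(rbind, lapply(ids, function(idx) {
    # drop the first column and flatten the rest into one numeric vector
    vals <- unlist(x[[idx]][, -1])
    c(mean(vals), var(vals))
}))
# out is a matrix: column 1 holds the means, column 2 the variances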
Arun
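
Since the question asks for two separate vectors, here is a minimal sketch of how the matrix produced by the lapply approach above could be split into them. The toy list below is made-up data standing in for the real `x` (its column names `id`, `a`, and `b` are assumptions, not part of the question); `meanvector` and `variancevector` are the names used in the question.

```r
# Toy stand-in for the asker's list `x`: 60 small data frames (hypothetical data).
set.seed(1)
x <- lapply(1:60, function(i) data.frame(id = 1:4, a = rnorm(4), b = rnorm(4)))

ids <- seq(3, 54, by = 3)  # 3, 6, ..., 54

# One row per selected data frame: column 1 is the mean, column 2 the variance.
out <- do.call(rbind, lapply(ids, function(idx) {
    vals <- unlist(x[[idx]][, -1])  # all values except the first column
    c(mean(vals), var(vals))
}))

# Split the two columns into the vectors the question asks for.
meanvector     <- out[, 1]
variancevector <- out[, 2]
```

Each vector has one entry per index in `ids`, in the same order, so `meanvector[1]` belongs to the third data frame, `meanvector[2]` to the sixth, and so on.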
If x is a list of 1000 data frames, you can use lapply to return the means and variances of a subset of this list.

ix = seq(1, 1000, 3)
lapply(x[ix], function(df){
    #exclude the first column
    c(mean(df[,-1]), var(df[,-1]))
})
kith
  • I'm not sure if you can take the mean of a `df` directly without a warning. And that takes the mean of every column separately, not of the entire `data.frame` (iiuc). – Arun Feb 25 '13 at 21:10
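
Following up on the comment above: in current versions of R, calling `mean()` on a data frame returns `NA` with a warning, and `var()` on a data frame returns a covariance matrix rather than a single number. One possible adaptation of the list-subsetting approach, sketched here with made-up data in place of the real `x`, is to flatten each data frame with `unlist()` first. The index range is changed to the 3-to-54 range from the question, and `vapply()` is swapped in for `lapply()` so the result comes back as a numeric matrix; both are choices made for this sketch, not part of the original answer.

```r
# Toy stand-in for the asker's list `x` (hypothetical data and column names).
set.seed(42)
x <- lapply(1:60, function(i) data.frame(id = 1:5, a = runif(5), b = runif(5)))

ix <- seq(3, 54, by = 3)  # every third element, 3 through 54

# vapply() checks that each result is a length-2 numeric vector and returns
# a 2-row matrix with one column per selected data frame.
stats <- vapply(x[ix], function(df) {
    # drop = FALSE keeps a data frame even if only one column remains
    vals <- unlist(df[, -1, drop = FALSE])
    c(mean = mean(vals), var = var(vals))
}, FUN.VALUE = c(mean = 0, var = 0))

meanvector     <- stats[1, ]  # first row: means
variancevector <- stats[2, ]  # second row: variances
```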