Finding mean of every x observations for list of dataframes

Question

I'm trying to follow this SO post: Calculate the mean of every 13 rows in data frame, but for some reason it's not working correctly on my end. Their example works fine:

df <- data.frame(a=1:12, b=13:24 );
df
n <- 5;
aggregate(df,list(rep(1:(nrow(df)%/%n+1),each=n,len=nrow(df))),mean)[-1];

     a    b
1  3.0 15.0
2  8.0 20.0
3 11.5 23.5

But mine, which uses a for loop over a list of data frames, doesn't:

for (dset in 1:5){
  if(dset == 1){n <- 60}
  else{n <- 12}#else combine by 12
  print(n)
  v.ntrade <- aggregate(B.list[[dset]][,7],list(rep(1:(nrow(B.list[[dset]][,7])%/%n+1),each=n,len=nrow(B.list[[dset]][,7]))),sum)
  v.volume <- aggregate(B.list[[dset]][,5],list(rep(1:(nrow(B.list[[dset]][,5])%/%n+1),each=n,len=nrow(B.list[[dset]][,5]))),sum)

  B.list[[dset]] <- aggregate(B.list[[dset]],list(rep(1:(nrow(B.list[[dset]])%/%n+1),each=n,len=nrow(B.list[[dset]]))),mean)
  #replace vol and ntrades
  B.list[[dset]][,7] <- v.ntrade[,2]
  B.list[[dset]][,5] <- v.volume[,2]
  B.list[[dset]] <- B.list[[dset]][,-1]    }

Before:

> B.list[[1]][,4]
       PAIRclose
    1:   8063.21
    2:   8065.95
    3:   8053.50
    4:   8040.00
    5:   8054.00
   ---          
75009:   7471.40
75010:   7461.99
75011:   7472.56
75012:   7482.05
75013:   7469.69

After:

> B.list[[1]][,4]
   [1] 5698.0203 2257.8796 2886.9289 1812.9951 1521.3267 2305.9228 1103.6083

etc

Is there some weird behavior in the aggregate function? Or is it something to do with the %/%n+1 part, which I don't understand?

You need to provide a reproducible example of your data for us to work upon. — Ronak Shah, Jul 09 '18 at 06:23
Please share a sample of your data using `dput()` (not `str` or `head` or a picture/screenshot) so others can help. See more here https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example?rq=1 — Tung, Jul 09 '18 at 06:24

akrun · Answer 1 · 2018-07-09T06:33:50.373

We can do this with tidyverse. Loop through the list of datasets with map, create a grouping variable with gl that puts every n consecutive rows in one group, and use summarise_all to get the mean of every other column within each group:

library(tidyverse)
lst %>% 
    map(~ .x %>%
            group_by(grp = as.integer(gl(n(), n, n()))) %>% 
            summarise_all(mean))
#[[1]]
# A tibble: 3 x 3
#    grp     a     b
#  <int> <dbl> <dbl>
#1     1   3    15  
#2     2   8    20  
#3     3  11.5  23.5

#[[2]]
# A tibble: 3 x 3
#    grp     a     b
#  <int> <dbl> <dbl>
#1     1   3    15  
#2     2   8    20  
#3     3  11.5  23.5

Or, using base R, with lapply and aggregate:

lapply(lst, function(x)
    aggregate(. ~ cbind(grp = as.integer(gl(nrow(x), n, nrow(x)))), x, mean)[-1])
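
To see what the grouping expression does, here is a small sketch that builds the grouping vector on its own, both with the rep/%/% form from the question and with the gl form used above. The names nr, n_groups, grp_rep and grp_gl are only for illustration; nr = 12 matches the row count of the example data frame.

```r
n  <- 5    # chunk size
nr <- 12   # number of rows in the example data frame

# %/% is integer division: 12 %/% 5 is 2, so 1:(2 + 1) gives group ids 1, 2, 3.
n_groups <- nr %/% n + 1

# Form from the question: repeat each id n times, then cut to nr values.
grp_rep <- rep(1:n_groups, each = n, length.out = nr)

# Form from this answer: gl() builds the same pattern as a factor.
grp_gl <- as.integer(gl(nr, n, nr))

print(grp_rep)
print(identical(grp_rep, grp_gl))
```

Both forms label rows 1-5 as group 1, rows 6-10 as group 2, and the two leftover rows as group 3, so the last group is simply shorter when the row count is not a multiple of n.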

data

# df and n (n <- 5) are taken from the example in the question
lst <- list(df, df)
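
A likely reason the loop in the question returns odd values: aggregate puts the grouping column first in its result, so at the point where columns 5 and 7 are overwritten, every original column sits one position further right. Replacing by position can then hit the wrong column. The sketch below takes the mean of most columns and the sum of a few, selecting columns by name instead. The data, the column names (close, volume, ntrade) and the helper summarise_chunks are hypothetical, not taken from the question's data, and the sketch assumes plain data frames (a data.table could be converted with as.data.frame first).

```r
# Hypothetical data: one price-like column and two count-like columns.
toy <- data.frame(close = 1:12, volume = rep(10, 12), ntrade = rep(2, 12))

# Hypothetical helper: mean of every column per chunk of n rows,
# except the columns named in sum_cols, which are summed instead.
summarise_chunks <- function(x, n, sum_cols) {
  grp   <- as.integer(gl(nrow(x), n, nrow(x)))   # 1,1,...,2,2,... in blocks of n
  means <- aggregate(x, by = list(grp = grp), FUN = mean)
  sums  <- aggregate(x[sum_cols], by = list(grp = grp), FUN = sum)
  means[sum_cols] <- sums[sum_cols]              # replace by name, not by position
  means[names(x)]                                # drop grp, keep original column order
}

res <- summarise_chunks(toy, n = 5, sum_cols = c("volume", "ntrade"))
print(res)
```

For a list of data frames with a different chunk size for the first element, as in the question, the same helper could be called inside lapply or Map with n chosen per element.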
