Average every Nth column in data frame by group in dataframe

Question

I have a list of data frames. I want to average every n days (rows) in each data frame.

I am trying to lapply over my list.

test <- lapply(dataframe_list, function(d) {
  n <- 14
  aggregate(d, list(rep(1:(nrow(d) %/% n + 1), each = n, len = nrow(d))), mean)[-1]
  d
})

But I get warnings:

Warning messages:
1: In mean.default(X[[1L]], ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(X[[2L]], ...) :
  argument is not numeric or logical: returning NA
3: In mean.default(X[[3L]], ...) :
  argument is not numeric or logical: returning NA

Here is the result of head(df) on one of the data frames in the list:

  KGID 3MEHIS ACE_POT ADD_SUG_AVAIL_CHO ADD_SUG_TOT_SUG   ALA ALCOHOL PROTEIN_AN
1 KGID      0       0             3.135               0 1.848       0     24.181
2 KGID      0       0             3.135               0 1.848       0     24.181
3 KGID      0       0             3.135               0 1.848       0     24.181
4 KGID      0       0             3.135               0 1.848       0     24.181
5 KGID      0       0             3.135               0 1.848       0     24.181
6 KGID      0       0             3.135               0 1.848       0     24.181

Ultimately, I would like to see an average of the first 14 rows of this data frame. Of course, the first column can't be averaged. Is that my problem?

Please show a small example dataset 5-10 columns with 5-10 rows and expected result based on that. The description is confusing. For guidelines, check [here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) — akrun, Jul 14 '15 at 11:50
Right now I have a list of dataframes. I am trying to lapply over the dataframe_list with a function that averages every 14 rows together for each dataframe in the list, and then call that new dataframe list "test". `test<-lapply(dataframe_list, function(d){ n <- 14 aggregate(d,list(rep(1:(nrow(d)%/%n+1),each=n,len=nrow(d))),mean)[-1] d } )` — user3795577, Jul 14 '15 at 13:54
That is fine. But, can you show the example of a single dataset with 5-10 columns and 10 rows and its expected output, so that we can extend it to the list — akrun, Jul 14 '15 at 13:56
The problem is that the code throws warnings. 'Warning messages: 1: In mean.default(X[[1L]], ...) : argument is not numeric or logical: returning NA 2: In mean.default(X[[2L]], ...) : argument is not numeric or logical: returning NA 3: In mean.default(X[[3L]], ...) : argument is not numeric or logical: returning NA' at least 50 of them. Here is the result of head(df): — user3795577, Jul 14 '15 at 13:58
BTW, In your example, there are character columns. So that may be the reason you have warnings. Subset the numeric columns and then do the mean — akrun, Jul 14 '15 at 14:03

akrun · Accepted Answer · 2015-07-14T14:14:52.660

The warning messages suggest that non-numeric columns were included in the mean calculation. In the example shown, the first column is non-numeric. We can remove it with d1[-1], then create a new grouping column ('grp') and use the formula method of aggregate to get the mean of each group. More generally, if there are several non-numeric columns, we can use a logical condition (sapply(d1, is.numeric)) to subset the dataset to its numeric columns only.

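 n <- 14
 # keep only the numeric columns, add a block index 'grp' (1 for rows 1-14,
 # 2 for rows 15-28, ...), then average every column within each block
 aggregate(. ~ grp,
           transform(d1[sapply(d1, is.numeric)],
                     grp = as.numeric(gl(nrow(d1), n, nrow(d1)))),
           FUN = mean, na.rm = TRUE, na.action = NULL)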
 #   grp X3MEHIS ACE_POT ADD_SUG_AVAIL_CHO ADD_SUG_TOT_SUG   ALA ALCOHOL
 #1   1       0       0             3.135               0 1.848       0
 #  PROTEIN_AN
 #1     24.181
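
To make the grouping column easier to inspect, here is a minimal sketch on a made-up data frame. The names toy, id, x and y are hypothetical (they are not from the question), and n is set to 2 rather than 14 so that the blocks are short enough to read. It also shows an index built with integer division, which should give the same blocks as the gl() call.

```r
# Toy data: one character column and two numeric columns (made-up values)
toy <- data.frame(id = rep("KGID", 5),
                  x  = c(1, 3, 5, 7, 9),
                  y  = c(10, 20, 30, 40, 50),
                  stringsAsFactors = FALSE)
n <- 2

# Logical vector marking the numeric columns: id = FALSE, x = TRUE, y = TRUE
num_cols <- sapply(toy, is.numeric)

# Block index per row: rows 1-2 -> 1, rows 3-4 -> 2, row 5 -> 3
grp <- as.numeric(gl(nrow(toy), n, nrow(toy)))

# The same index written with integer division
grp_alt <- (seq_len(nrow(toy)) - 1) %/% n + 1

# Mean of each numeric column within each block
res <- aggregate(. ~ grp, transform(toy[num_cols], grp = grp), FUN = mean)
print(res)
```

The last block holds only one row here, so its "mean" is just that row; the same happens whenever nrow is not a multiple of n.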

For the list of data frames, we can apply the same code to each element with lapply.

 lapply(dataframe_list, function(d)
   aggregate(. ~ grp,
             transform(d[sapply(d, is.numeric)],
                       grp = as.numeric(gl(nrow(d), n, nrow(d)))),
             FUN = mean, na.rm = TRUE, na.action = NULL))
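
As a rough end-to-end check, the list version can be tried on two small made-up data frames. Everything in this sketch is an assumption for illustration: the contents of dataframe_list, the column names id, v1 and v2, and n = 3 in place of the question's 14.

```r
# Two small made-up data frames standing in for dataframe_list
dataframe_list <- list(
  a = data.frame(id = rep("A", 6), v1 = 1:6, v2 = c(2, 4, 6, 8, 10, 12),
                 stringsAsFactors = FALSE),
  b = data.frame(id = rep("B", 4), v1 = c(10, 20, 30, 40), v2 = c(1, 1, 3, 3),
                 stringsAsFactors = FALSE)
)
n <- 3  # block size; the question uses 14

test <- lapply(dataframe_list, function(d) {
  num <- d[sapply(d, is.numeric)]                 # drop non-numeric columns
  num$grp <- as.numeric(gl(nrow(d), n, nrow(d)))  # block index for each row
  aggregate(. ~ grp, num, FUN = mean)             # last expression is the return value
})

print(test$a)
print(test$b)
```

Note that the aggregate() call is the last expression in the function, so it is what lapply collects. In the question's version the trailing d is the last expression, so the original, unaggregated data frame is returned instead.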

This solution provided the output I was looking for. Thank you akrun. — user3795577, Jul 14 '15 at 14:18
