-1

I have a data frame with 84 rows and 48 columns. For every block of 7 consecutive rows, I want to calculate one statistic per column, following the pattern sum, sum, min, max across each group of 4 consecutive columns, and then repeat that pattern for the next 4 columns until all 48 columns of the data frame are covered.

I have already found a StackOverflow post, but its approach did not work for my whole data frame. It only handles one column at a time and computes a single statistic per column:

v=dataset$count
n = 7
sidx = seq.int(from=1, to=length(v), by=n)
eidx = c((sidx-1)[2:length(sidx)], length(v))
thesum = sapply(1:length(sidx), function(i) sum(v[sidx[i]:eidx[i]]))
thesum
 [1] 10957 10955 10953 10955 10954 10955 10957 10956 10958 10953 10954 10956
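
To make the indexing in the snippet above easier to follow, here is a small sketch of what the start and end index vectors contain, assuming 84 rows as in the question. The names `n_rows` and `blocks` are introduced here only for illustration.

```r
# Assumed sizes from the question: 84 rows split into blocks of 7
n_rows <- 84
n <- 7

# First row of every block: 1, 8, 15, ..., 78
sidx <- seq.int(from = 1, to = n_rows, by = n)

# Last row of every block: one before the next start, and n_rows for the last block
eidx <- c(sidx[-1] - 1, n_rows)

# One line per block, showing which rows it covers
blocks <- data.frame(start = sidx, end = eidx)
print(blocks)
```

Each of the 12 blocks spans exactly 7 rows, so `sidx[i]:eidx[i]` selects the rows of block `i`.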
camille
Gab

2 Answers

0

I am not sure I follow your requirements exactly, but you can use indexing in nested loops. This loop computes the summary statistics for each block of 7 rows, over each group of four columns.

# making example data
ir <- iris[ 1:84 , 1:4]
ir <- do.call(cbind,  rep( ir, 12))

# this is the size you specified
dim( ir )

FINAL <- NULL

# For every set of seven rows
for( i in seq( 1 , nrow( ir) , 7 ) ){
    OUT <- NULL
    # For every set of four columns
    for( j in seq( 1 , ncol( ir) , 4 ) ){

      out <- cbind(
        sum1 =  sum(  ir[ i:(i+6) ,  j ]  ),
        sum2 =  sum(  ir[ i:(i+6) ,  j+1 ]  ),
        min1 =  min(  ir[ i:(i+6) ,  j+2 ]  ),
        max1 =  max(  ir[ i:(i+6) ,  j+3 ]  )
      )

      # append this group's four statistics to the current output row
      OUT <- cbind( OUT , out )
    }

    # append below the earlier rows so the blocks keep their original order
    FINAL <- rbind( FINAL , OUT )
}

# output object matches your specification: 12 rows, 48 columns
dim( FINAL )
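
As a possible alternative to the nested loops, the same block statistics can be sketched with a grouping vector and `tapply`. This is only a sketch under the same assumptions (84 rows, 48 numeric columns, pattern sum, sum, min, max); the names `block`, `funs` and `res` are made up for illustration.

```r
# Example data with the same shape as above: an 84 x 48 numeric matrix
ir <- as.matrix(iris[1:84, 1:4])
ir <- do.call(cbind, rep(list(ir), 12))

# Block id for every row: 1 for rows 1-7, 2 for rows 8-14, and so on
block <- rep(seq_len(nrow(ir) / 7), each = 7)

# One function per column, repeating the pattern sum, sum, min, max
funs <- rep(list(sum, sum, min, max), length.out = ncol(ir))

# For each column, apply its function within every block of 7 rows
res <- sapply(seq_len(ncol(ir)), function(j) tapply(ir[, j], block, funs[[j]]))

dim(res)
```

Each column of `res` holds the 12 block values for the matching column of `ir`, with the blocks in their original row order.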
MatthewR
  • Please see my explanation below. – Gab May 04 '19 at 00:31
  • For a data frame of 84 rows and 48 columns, I need these statistics over blocks of 7 rows for every four columns: first column "sum", second column "sum", third column "min", fourth column "max", in that order, until reaching the 48th column ... – Gab May 04 '19 at 01:01
  • @Gab - Why do you have two sums? Also, what does the end output look like? Are you running these four stats on 48 cols, or on 12 groups of 4 cols? – MatthewR May 04 '19 at 07:33
  • I think I might not have explained it correctly: I have a data frame of 84 rows and 48 cols, and for each group of 4 columns I need to perform the following: take each block of 7 rows of the first col and compute the sum; take each block of 7 rows of the second col and compute the sum; take each block of 7 rows of the third col and compute the min; take each block of 7 rows of the fourth col and compute the max; and repeat all this across the 48 cols. In the end I should get a data frame of 12 rows (84 rows / 7 rows) and 48 cols. – Gab May 04 '19 at 15:59
  • I think it is nice code, but there is a difference between your results and the ones from the code I have just posted above .... I think either your code or mine is not including the 7th row of each block when computing the statistics for each column. Let me know what you think. – Gab May 05 '19 at 20:48
0

I also ended up combining code from several places, in a different way, as follows, and it worked well:

n = 7
sidx = seq.int(from=1, to=nrow(dataset), by=n)
eidx = c((sidx-1)[2:length(sidx)], nrow(dataset))
# create an empty 12 x 48 data frame for the results
k=data.frame(matrix(nrow = 12,ncol = 48))

for (i in 1:12){
   rows = sidx[i]:eidx[i]   # the i-th block of 7 rows
   for(j in 1:12){
      # source and destination columns match: 4j-3, 4j-2, 4j-1, 4j
      # (assumes the 48 data columns start at column 1 of dataset;
      #  shift the source indices if there are leading ID columns)
      k[i,(4*j)-3]=sum(dataset[[(4*j)-3]][rows])
      k[i,(4*j)-2]=sum(dataset[[(4*j)-2]][rows])
      k[i,(4*j)-1]=min(dataset[[(4*j)-1]][rows])
      k[i,(4*j)]=max(dataset[[(4*j)]][rows])
 }
}
View(k)
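
Since the comments above mention that the two approaches gave different results, one way to check whether every block really includes its 7th row is to run the block logic on data with known answers. The sketch below uses a hypothetical test data frame in which every entry equals its row number; `got` and `expected` are illustrative names, and only the first, third and fourth columns are checked.

```r
# Hypothetical test data: 84 x 48, every entry equals its row number
dataset <- as.data.frame(matrix(rep(1:84, times = 48), nrow = 84, ncol = 48))

n <- 7
sidx <- seq.int(from = 1, to = nrow(dataset), by = n)
eidx <- c(sidx[-1] - 1, nrow(dataset))
i <- seq_along(sidx)

# Block statistics for columns 1 (sum), 3 (min) and 4 (max)
got <- t(sapply(i, function(b) {
  rows <- sidx[b]:eidx[b]
  c(sum = sum(dataset[[1]][rows]),
    min = min(dataset[[3]][rows]),
    max = max(dataset[[4]][rows]))
}))

# Known answers for block i, which covers rows 7i-6 to 7i:
# sum = 49i - 21, min = 7i - 6, max = 7i
expected <- cbind(sum = 49 * i - 21, min = 7 * i - 6, max = 7 * i)

# Stops with an error if any block drops or gains a row
stopifnot(all(got == expected))
```

If a block were missing its last row, the sum and max for that block would come out too small and the check would fail.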
Gab