I have a number of data frames of varying length, from roughly 15,000 to 500,000 rows. I would like to split each of them into smaller data frames of 300 rows each, which I will then process further. How can I do this?

The question "Split up a dataframe by number of rows" provides a partial answer, but it doesn't work for me because not all of my data frames have a row count that is a multiple of 300.

I would greatly appreciate both a plyr and a non-plyr solution.

Thank you!

mchangun

2 Answers

I don't understand why a plyr solution is needed. split works perfectly well and even hadley himself didn't suggest a plyr/reshape2 solution when he looked at the earlier question:

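split(dfrm, (seq_len(nrow(dfrm)) - 1) %/% 300)  # integer division

Rows 1 to 300 land in group 0, rows 301 to 600 in group 1, and so on, with any leftover rows in a shorter final group. Writing the index as 0:nrow(dfrm) instead makes the grouping vector one element longer than the data, which produces a length-mismatch warning. Since you are expecting a non-evenly divisible result you could ignore that warning, but building the index with one entry per row avoids it.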
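
As a rough illustration of the split approach, here is a minimal sketch on a toy data frame. The 1000-row `dfrm`, the `chunk_size` variable and the `grp` vector are made up for the example. The grouping vector is built with exactly one label per row, so its length matches the data.

```r
# Toy data frame with 1000 rows (a hypothetical stand-in for dfrm);
# 1000 is deliberately not a multiple of 300.
dfrm <- data.frame(id = seq_len(1000), x = rnorm(1000))

chunk_size <- 300

# One group label per row: rows 1-300 get 0, rows 301-600 get 1, and so on.
grp <- (seq_len(nrow(dfrm)) - 1) %/% chunk_size

# split() returns a named list of data frames, one per group label.
chunks <- split(dfrm, grp)

length(chunks)        # 4 pieces
sapply(chunks, nrow)  # 300, 300, 300, 100
```

Each element of `chunks` is an ordinary data frame, so further processing can be applied with `lapply(chunks, ...)`.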

IRTFM

Something like the following may help

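numBreaks <- ceiling(nrow(DAT) / 300)  # number of pieces; the last one may be shorter
for (i in seq_len(numBreaks)) {
  # rows (i-1)*300 + 1 through i*300, capped at the last row of DAT
  smallDAT <- DAT[((i - 1) * 300 + 1):min(nrow(DAT), i * 300), ]
  # ..... (process smallDAT here)
}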
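
If the pieces need to be kept rather than processed one at a time, the same row-range idea can be wrapped in a function that returns a list. This is only a sketch: `split_rows` is a hypothetical helper name, and the 600-row `DAT` is made up to show that an exact multiple of 300 does not yield an extra piece.

```r
# Hypothetical helper: cut DAT into consecutive pieces of at most `size` rows.
split_rows <- function(DAT, size = 300) {
  n <- nrow(DAT)
  if (n == 0) return(list())
  starts <- seq(1, n, by = size)  # first row of each piece
  lapply(starts, function(s) {
    # drop = FALSE keeps a data frame even when DAT has a single column
    DAT[s:min(s + size - 1, n), , drop = FALSE]
  })
}

# Toy data: 600 rows is an exact multiple of 300.
DAT <- data.frame(id = seq_len(600), y = seq_len(600) * 2)
pieces <- split_rows(DAT)
length(pieces)  # 2
```

Deriving the start rows from `seq(1, n, by = size)` sidesteps counting the pieces by hand, which is where an off-by-one is easy to introduce.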
user1609452