
I'm trying to append a month's worth of data (200k rows) to a data.frame that already has 16m rows, and am hitting R's memory limit on my system:

d = rbind(d, n)
Error: cannot allocate vector of size 60.8 Mb
In addition: Warning messages:
1: In rbind(deparse.level, ...) :
  Reached total allocation of 8072Mb: see help(memory.size)

memory.size() and memory.limit() report 2187.88 and 8072 respectively, so I think I'm using all of my 8GB system memory. Using an object memory-reporting function detailed by JD Long in this question, I get the following report:

            Type          Size     Rows Columns
d     data.table 2,231,877,576 15941535      26
files  character           912       13      NA
i        numeric            48        1      NA
n     data.frame    28,176,000   213116      26
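
For reference, a minimal sketch of such a reporting function (base R only; the name obj.report is hypothetical, and JD Long's actual version may differ in detail) looks like this:

# List every object in the given environment with its class, size in
# bytes, and dimensions, sorted largest-first (a sketch, not the
# original function)
obj.report <- function(env = .GlobalEnv) {
  objs <- ls(envir = env)
  report <- data.frame(
    Type = sapply(objs, function(x) class(get(x, envir = env))[1]),
    Size = sapply(objs, function(x) as.numeric(object.size(get(x, envir = env)))),
    Rows = sapply(objs, function(x) {
      d <- dim(get(x, envir = env))
      if (is.null(d)) length(get(x, envir = env)) else d[1]
    }),
    Columns = sapply(objs, function(x) {
      d <- dim(get(x, envir = env))
      if (is.null(d)) NA else d[2]
    })
  )
  report[order(-report$Size), ]
}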

Is there another way to append to a data.frame that avoids the object duplication that appears to be taking place and eating up memory? I'm keen to avoid appending to CSV files because I'm working with .RData saved objects for quicker reading-in of data.


1 Answer


If you are using data.table objects, you should use rbindlist to avoid making unnecessary copies of your data. This should work:

d = rbindlist(list(d, n))
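
Note that rbindlist takes a single list of data.tables (or data.frames) as its first argument, hence the list() wrapper. It allocates the combined result in one pass rather than going through rbind's intermediate copies, which is what keeps the memory overhead down.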