I have a dataset of several GB that I want to read into R in chunks, apply some transformations, and export to Vowpal Wabbit format. To do that, I read a chunk into a data.table `DT` with `fread`, call a couple of functions on it, set `DT <- NULL`, call the garbage collector `gc()`, and repeat the process. However, memory usage still stays close to its maximum, and the whole process is slow (a couple of hours).
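My loop looks roughly like this (the file name, chunk size, and the `transform_chunk`/`write_vw` helpers are placeholders standing in for my actual code):

```r
library(data.table)

chunk_size <- 1e6
col_names  <- names(fread("big.csv", nrows = 0))      # read the header only
n_rows     <- nrow(fread("big.csv", select = 1L))     # cheap row count (one column)

rows_done <- 0
while (rows_done < n_rows) {
  # skip the header line plus all rows already processed
  DT <- fread("big.csv", skip = rows_done + 1, nrows = chunk_size,
              header = FALSE, col.names = col_names)

  transform_chunk(DT)       # placeholder: in-place transformations via :=
  write_vw(DT, "out.vw")    # placeholder: appends VW-format lines to the output

  rows_done <- rows_done + nrow(DT)
  DT <- NULL                # drop the reference ...
  gc()                      # ... then ask R to release the memory
}
```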
Based on Tricks to manage the available memory in an R session, I wonder whether there is a way to have `fread` update an existing `DT` in place (not `DT <- fread()`, but via a `:=`-style assignment by reference), so that no sequence of intermediate objects is created. Or do you have any other suggestion?
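To illustrate the kind of in-place update I mean (again with a placeholder file name and chunk bounds; as far as I can tell `fread` has no such mode, so the second form is only wishful pseudocode):

```r
library(data.table)

# What I do now: each call allocates a brand-new data.table
DT <- fread("big.csv", skip = 1, nrows = 1e6)

# What I am hoping for: fread filling the existing DT by reference,
# the way := updates columns in place. Something like
#   DT[, names(DT) := fread("big.csv", skip = 1e6 + 1, nrows = 1e6)]
# does overwrite the columns by reference, but the right-hand side is
# still materialised as a fresh object first, so the copy is not avoided.
```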