
I have looked here and here, and in a few other places, but I am still having trouble.

My code works with very large data files and creates even larger variables. For example, I work with a 3D matrix of dimensions n * n * j, where 1400 < n < 1500 and 150000 < j < 750000. As a single 3D matrix this comes to roughly 10 TB of memory. I am now working with a `vector(mode="list", length=j)` that holds an n * n matrix at every position. It works, but it is far too big.
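For reference, the size estimate follows directly from 8 bytes per double; a quick back-of-the-envelope check (the specific n and j below are only illustrative values within the stated ranges):

n <- 1450; j <- 450000   # illustrative values within the stated ranges
n * n * j * 8 / 1e12     # ~7.6 TB; the upper end (n = 1500, j = 750000) is ~13.5 TB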

I am trying to split the list into memory-friendly chunks and save each chunk to a file. After checking the sizes and memory use of the variables here, I know my resources can handle a list of length ~10000. I want to save the list (with the different matrices inside) to a file and read that file back later to recover the variable for further use.

Any ideas? Is there a better way to store the data in the code?

P.S. RAM and disk space are not an issue here: up to about 60 GB of RAM and 1 TB of disk space (flexible). P.P.S. The code takes a few days to run, so I will take any idea, even one that costs a lot of time.

Thanks! whitestorm

Adding an example of the vector that I want to export and import later:

n <- 1500
j <- 10000
p <- vector(mode = "list", length = j)
for (i in 1:j) { p[[i]] <- matrix(runif(n * n), ncol = n) }
# save the vector p to a file and continue with the next 10000 (i in (j+1):(2*j))
# after finishing 10 to 100 repeats, read the files back one by one and use them.
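
A minimal sketch of the chunked save/read workflow, assuming each chunk is built exactly as in the loop above and that numbered file names such as p_chunk_001.rds (illustrative, not fixed) are acceptable:

n <- 1500
j <- 10000        # matrices per chunk, as above
n_chunks <- 10    # illustrative: somewhere in the 10 to 100 repeats mentioned above

for (k in 1:n_chunks) {
  p <- vector(mode = "list", length = j)
  for (i in 1:j) { p[[i]] <- matrix(runif(n * n), ncol = n) }  # build chunk k
  saveRDS(p, file = sprintf("p_chunk_%03d.rds", k))            # write chunk k to disk
  rm(p); gc()                                                  # free memory before the next chunk
}

# later: read the chunks back one at a time and use them
for (k in 1:n_chunks) {
  p <- readRDS(sprintf("p_chunk_%03d.rds", k))
  # ... use the j matrices in p ...
  rm(p); gc()
}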

  • `save`, `saveRDS`, and `jsonlite::toJSON` are all good examples, depending on your needs. There are likely many others. – r2evans Dec 08 '19 at 14:38
  • Thanks! Looks like `saveRDS` and `readRDS` will do the job for me! Works perfectly. After checking it, it's not that good. It works, BUT it will result in about 8 TB of disk space and about 3-4 more weeks of saving and reading time (using 1 core with 60 GB RAM). Any other ideas? – whitestorn Dec 11 '19 at 08:51
  • `save` and `saveRDS` are (give or take) the best compression you're going to get in a real-time-readable compressed format. If you're complaining about 8TB of disk space in those formats ... then you really need to address your data size problem. R is not going to make that simpler. – r2evans Dec 11 '19 at 13:14
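
Regarding the disk-space versus save/read-time trade-off raised in the comments above: the `compress` argument of `saveRDS` is one knob that moves the two in opposite directions. A minimal sketch of the two extremes (file name illustrative; whether either setting helps enough here is an assumption):

# smaller files, slower to write: stronger compression
saveRDS(p, file = "p_chunk_001.rds", compress = "xz")

# larger files, much faster to write and read: no compression
saveRDS(p, file = "p_chunk_001.rds", compress = FALSE)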
