
I tried what turned out to be a quite memory-intensive operation in R: writing an xlsx file from a dataset with 500k observations and 2000 variables.

I tried the method explained here (first comment).

I set the max VSIZE to 10 GB and it still did not work. I did not want to try more than that because I was afraid of damaging my computer (I saved money for it for a long time :)).

I then looked into cloud computing with R, which I found to be quite difficult as well.

So finally, I wanted to ask here how high I can set the VSIZE without damaging my computer, or whether there is another way to solve my problem. (The goal is to convert a SAS file to an xlsx or xls file. The files are between 1.4 GB and 1.6 GB, and I have about 8 GB of RAM.) I am open to downloading other programs if that is not too complicated.

Cheers.

  • I don't think this should be a big drama. You can use the `haven` package to read `.sas7bdat` files into R and the `writexl` package to put them back out again (a sketch of this route follows the comments). If you're open to a command-line alternative there is also readstat - https://github.com/WizardMac/ReadStat – thelatemail Oct 25 '18 at 21:41
  • Which OS are you using? 32-bit or 64-bit? Which R version? – Tung Oct 25 '18 at 21:44
  • One simple solution would be to just write it to a CSV (which should not take that much memory), then open it in Excel and save it as an xlsx file (see the second sketch after the comments). Some of the Excel packages are quite memory heavy. – Ian Wesley Oct 25 '18 at 22:08
  • I don't think it will fit. Each numeric entry takes 10 bytes. `500000*2000*10` returns `[1] 1e+10`, which is comparable to all of your memory. You need to find a different route. You will need to say what you are referring to when you talk about "VSIZE". There won't be any damage. R will probably just refuse after it tries to allocate too much memory. – IRTFM Oct 25 '18 at 22:12
  • @Tung 64-bit. R-Version: 3.5.1 –  Oct 25 '18 at 22:55
  • Would it work if you try your script on other computers that have 12, 16 or 32GB RAM? – Tung Oct 25 '18 at 23:52
  • 32 gig might work. The rule of thumb is at least three times the size of the largest object. – IRTFM Oct 26 '18 at 00:56
  • If your usage is learning, or academic, you can use SAS UE, which would do this very very quickly. As in a few minutes, though the installation would take longer :). – Reeza Oct 26 '18 at 01:19
  • @42 I've heard the 3X rule for sorting SAS datasets; does that generalize to R, or is it a general computing rule of thumb? – Reeza Oct 26 '18 at 01:20
  • R copies any object when it is changed, sometimes more than once. One needs to think about having adequate _contiguous_ memory available. Best to start up fresh OS and R sessions with no other applications open. – IRTFM Oct 26 '18 at 01:28
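
A minimal sketch of the haven + writexl route thelatemail suggests above, assuming the input is a single `.sas7bdat` file. The file names are placeholders, and the whole table still has to fit in memory, which IRTFM's estimate suggests is the real obstacle here.

```r
# Read the SAS file and write it straight back out as xlsx.
# install.packages(c("haven", "writexl")) if the packages are missing.
library(haven)
library(writexl)

dat <- read_sas("input.sas7bdat")  # placeholder path; returns a tibble
write_xlsx(dat, "output.xlsx")     # xlsx itself allows up to 1,048,576 rows and 16,384 columns
```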
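
And a sketch of Ian Wesley's lighter-weight alternative: skip the xlsx writers entirely, export a plain CSV, and let Excel open it and save it as `.xlsx` afterwards. Again the file names are placeholders; `fwrite` from data.table is just one fast option for a table this size.

```r
# Export to CSV instead of xlsx; Excel can then open the CSV and save it as .xlsx.
library(haven)
library(data.table)

dat <- read_sas("input.sas7bdat")  # placeholder path
fwrite(dat, "output.csv")          # multi-threaded CSV writer, much lighter than the xlsx packages
```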
