After running several models, I need to call system() from my R script to shut down my EC2 instance, but when I get to that point I get:

cannot popen 'ls', probable reason 'Cannot allocate memory'

Note: for this question I even tried a plain ls, which failed the same way.

The flow of my script is as follows:

  • Load a model (about 2GB)
  • Mine documents and write to a MySQL database
  • Repeat the two steps above around 20 times with different models, averaging 2GB each
  • Terminate the instance

This is the point where I call system("sudo shutdown -h now") and nothing happens. When I try system("sudo shutdown -h now", intern=TRUE) instead, I get the allocation error.

I tried rm() for all my objects just before calling the shutdown, but the same error persists.

Here is some data on my system, which is a large EC2 Ubuntu instance:

R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] RTextTools_1.3.9   tau_0.0-15         glmnet_1.8         Matrix_1.0-6      
 [5] lattice_0.20-10    maxent_1.3.2       Rcpp_0.9.13        caTools_1.13      
 [9] bitops_1.0-4.1     ipred_0.8-13       prodlim_1.3.2      KernSmooth_2.23-8 
[13] survival_2.36-14   mlbench_2.1-1      MASS_7.3-21        rpart_3.1-54      
[17] e1071_1.6-1        class_7.3-4        tm_0.5-7.3         nnet_7.3-4        
[21] tree_1.0-31        randomForest_4.6-6 SparseM_0.96       RMySQL_0.9-3      
[25] ggplot2_0.9.1      DBI_0.2-5         

loaded via a namespace (and not attached):
 [1] colorspace_1.1-2   dichromat_1.2-4    digest_0.5.2       grid_2.15.1       
 [5] labeling_0.2       memoise_0.1        munsell_0.3        plyr_1.7.1        
 [9] proto_0.3-9.2      RColorBrewer_1.0-5 reshape2_1.2.1     scales_0.2.1      
[13] slam_0.1-25        stringr_0.6.1    

gc() returns

          used (Mb) gc trigger   (Mb)  max used   (Mb)
Ncells 1143171 61.1    5234604  279.6   5268036  281.4
Vcells 1055057  8.1  465891772 3554.5 767962930 5859.1

I noticed that if I run just 1 model instead of the 20 it works fine, so it might be that memory is not being freed after each run, even though I rm() the used objects.

I also noticed that if I close R, restart it, and then call system(), it works. If there is a way to restart R from within R, maybe I can add that to my script.sh flow.

What would be the appropriate way to clean up all of my objects and free the memory after each loop, so that when I need to call system() there is no memory issue?

Any tip in the right direction will be much appreciated! Thanks

JordanBelf
  • after `rm()`, do `gc()` to force garbage collection and see if that helps. – GSee Sep 07 '12 at 17:49
  • You can restart **R** like this: `assign(".Last", function() system("R"), pos=.GlobalEnv); q("no")` – GSee Sep 07 '12 at 17:52
  • Thanks a lot! I am trying both options right now. I guess that using the restarting option will solve my problem though probably not the most elegant way – JordanBelf Sep 07 '12 at 18:04
  • Apparently restarting R kills everything below the `q("no")` line, so I can't complete the execution of the script. I will try the other option – JordanBelf Sep 07 '12 at 18:43
  • I had the same issue on EC2, restarting R "fixed it". – Konstantinos Jun 14 '16 at 20:05
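
The `rm()`/`gc()` suggestion in the comments above can be applied once per iteration rather than only at the end. Here is a minimal sketch, assuming the per-model work can be wrapped in a function; `process_model` and the file names are hypothetical stand-ins, not the asker's code. Whether this frees enough memory for `system()` to succeed was not confirmed in this thread.

```r
# Hypothetical stand-in for the real "load model, mine documents" step.
# The large object lives only inside the function, so it becomes
# unreachable as soon as the function returns.
process_model <- function(path) {
  model <- numeric(1e6)      # placeholder for a large model object
  result <- length(model)    # placeholder for the real work
  result                     # only this small value escapes
}

model_paths <- sprintf("model_%02d.RData", 1:3)  # hypothetical file names
results <- integer(length(model_paths))

for (i in seq_along(model_paths)) {
  results[i] <- process_model(model_paths[i])
  invisible(gc())  # collect the unreachable model before the next load
}
```

Keeping large objects local to a function avoids having to track every name for `rm()`. Note that `gc()` only releases memory inside R; the operating system may not get all of it back.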

1 Answer

I'm just posting this because it's too long to fit in the comments. Since you haven't included any code, it's pretty hard to give advice. But, here is some code that maybe you can think about.

wd <- getwd()
assign('.First', function(x) {
  require('plyr') #and whatever other packages you're using
  file.remove(".RData") #already been loaded
  rm(".Last", pos=.GlobalEnv) #otherwise won't be able to quit R without it restarting
  setwd(wd)
}, pos=.GlobalEnv)
assign(".Last", function() {
  system("R --no-site-file --no-init-file --quiet")
}, pos=.GlobalEnv)
save.image() #or only save the things you want to be reloaded.
q("no")

The idea is that you save the things you need in a file called .RData. You create a .Last function that will be run when you quit R. The .Last function will start a new session of R. And you create a .First function that will be run as soon as R is restarted. The .First function will load packages you need and clean up.

Now, you can quit R and it will restart loading the things you need.

(q("no") means don't save, but you already saved everything you need in .RData which will be loaded when it restarts)
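
Because quitting discards everything below `q("no")`, the restarted session needs some way to know which model comes next. One possible way to do that, sketched here under assumptions, is to keep a small progress counter on disk. The helper names (`save_progress`, `load_progress`, `next_model`) and the file location are hypothetical and not part of the code above.

```r
# Hypothetical location for the counter; any writable path would do.
progress_file <- file.path(tempdir(), "progress.rds")

# Record the index of the last model that finished.
save_progress <- function(i, file = progress_file) {
  saveRDS(i, file)
}

# Read the last finished index, or 0 if nothing has run yet.
load_progress <- function(file = progress_file) {
  if (file.exists(file)) readRDS(file) else 0L
}

# Index of the next model to run, or NA when all are done.
next_model <- function(n_models, file = progress_file) {
  i <- load_progress(file) + 1L
  if (i > n_models) NA_integer_ else i
}
```

Inside `.First` one could call `next_model(20)`, run that model, call `save_progress(i)`, and quit again so that `.Last` starts the next session. When `next_model()` returns `NA`, the script would skip the restart and issue the shutdown command instead.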

GSee
  • That's a really interesting approach. I will modify my code to work with it and see if it works, though if it "loads from where I left off" I guess it will work fine. Thanks a lot for all your time helping me. – JordanBelf Sep 07 '12 at 19:18
  • I think `gc()` is more likely to be your ticket – GSee Sep 07 '12 at 19:20
  • On my first test I can confirm this answer works fine! Unfortunately, calling `rm(list = ls())` and `gc()` before the `system()` call did not. I have yet to test whether calling `rm` and `gc` at the end of each loop works; maybe that will free memory so that I reach `system` with more RAM available – JordanBelf Sep 07 '12 at 19:48
  • this is a very interesting approach for those hopeless situations where no number of gc() calls will help due to fragmentation/whatever. – vc273 Mar 12 '14 at 15:49
  • Might like to add `wait = FALSE` to the `system` call to avoid leaving the old process (and its memory) hanging in the operating system. I created a memory stable loop with this approach (with Rscript in the `system` call). – Morten Grum Apr 01 '20 at 05:22