
I have a very large for loop that I need to run in R (10 million rows). Would it generally be faster to loop through 5 million rows concurrently in two separate R windows, or to run the loop over all 10 million rows in a single R window?

Would two separate R windows take advantage of any sort of parallel processing?

For the purposes of this question, I do not want to use lapply.

M. T.
  • If each iteration can be done independently, then you should look at running them in `parallel` by leveraging multiple cores on your machine. `mcmapply` and `foreach` are very useful – Sonny Apr 26 '19 at 07:02
  • The advantage you'd get from running two different `R` windows is that you can use more RAM. If you hit your RAM limit with a single window, then yes, it would be faster to run it in two different windows. – boski Apr 26 '19 at 07:10
  • Can you share a minimal [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) of your data and code? – markus Apr 26 '19 at 07:24
  • I don't know if you're already well aware of this, but you might be able to make your R code much faster if it can be vectorized. See http://www.noamross.net/blog/2014/4/16/vectorization-in-r--why.html or http://www.win-vector.com/blog/2019/01/what-does-it-mean-to-write-vectorized-code-in-r/ – Jon Spring Apr 26 '19 at 07:25
  • Agree with Jon: first look for the possibility of vectorisation instead of for-looping. This can give you a MASSIVE speed increase, and is (usually) a lot more memory efficient. Only after that should you look at other solutions like parallelisation etc. – Wimpel Apr 26 '19 at 08:25
  • I am aware of the benefits of vectorisation, but for the purposes of this question I am not interested in that. I just would like to know whether a for loop will generally run faster in one window or when the task is split into two windows. – M. T. Apr 28 '19 at 19:08
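A minimal sketch of Sonny's suggestion, using the base `parallel` package to split the work across cores within a single R session rather than across two windows. The loop body (`sqrt(i)`) and the scaled-down `n` are placeholders, since the actual per-row computation isn't shown in the question:

```r
library(parallel)

# Placeholder for the real per-row computation (assumption: the actual
# loop body is unknown, so a cheap numeric function stands in for it).
f <- function(i) sqrt(i)

n <- 1e5  # scaled down from 10 million rows for illustration

# Plain for loop, as in the question: one R session, one core.
res_loop <- numeric(n)
for (i in seq_len(n)) res_loop[i] <- f(i)

# Same work split across cores in ONE session. mclapply() forks workers
# on Unix-alikes; forking is unavailable on Windows, so fall back to a
# single core there (parLapply() with a PSOCK cluster would be the
# Windows equivalent).
n_cores <- if (.Platform$OS.type == "windows") 1L else max(1L, detectCores() - 1L)
res_par <- unlist(mclapply(seq_len(n), f, mc.cores = n_cores))

# Both approaches produce identical results; only the wall time differs.
stopifnot(isTRUE(all.equal(res_loop, res_par)))
```

This only pays off when each iteration is independent, as Sonny notes; the forking overhead also means very cheap loop bodies may see little or no speed-up.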

0 Answers