
I've been experimenting with parallel computing in R using the doParallel package. From what I understand, depending on how you call registerDoParallel, you might use the multicore API or the snow API.

For example, to make use of forking with multicore, you'd do something like this:

registerDoParallel(4)

And to make use of forking with snow, you'd do something like this:

cl <- makeCluster(4, type="FORK")
registerDoParallel(cl)
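
For anyone wanting to reproduce the two setups end to end, here is a minimal sketch. It assumes a Unix-alike system (forking is not available on Windows), two workers, and a small random matrix `m` as a hypothetical stand-in for the real data. One difference it makes visible: the multicore backend forks afresh on each foreach call, whereas a FORK cluster forks its workers once, when makeCluster is called.

```r
library(doParallel)  # also attaches foreach, iterators and parallel

# Hypothetical stand-in data, created before any workers exist.
set.seed(42)
m <- matrix(rnorm(1000 * 50), nrow = 1000)

# Variant 1: a bare worker count. On Unix-alikes doParallel registers its
# multicore (mclapply-based) backend, which forks at each foreach call.
registerDoParallel(2)
backend_mc <- getDoParName()
means_mc <- foreach(j = seq_len(ncol(m)), .combine = c) %dopar% mean(m[, j])
stopImplicitCluster()

# Variant 2: an explicit FORK cluster. The workers are forked once, here,
# and then driven through the snow-style cluster backend.
cl <- makeCluster(2, type = "FORK")
registerDoParallel(cl)
backend_snow <- getDoParName()
means_snow <- foreach(j = seq_len(ncol(m)), .combine = c) %dopar% mean(m[, j])
stopCluster(cl)

# Show which backend foreach reported for each variant.
print(c(variant1 = backend_mc, variant2 = backend_snow))
```

Both variants should return the same column means; only the mechanism that runs the loop body differs.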

I have a fairly simple job* that calculates summary statistics on a matrix object. However, even though both of these approaches ostensibly make use of forking (and thus all child processes should be able to read the parent's copy of the matrix without duplicating it), I'm getting wildly different performance and memory usage. (In particular, multicore works substantially better.)
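
To make the gap measurable, a comparison along these lines could be used. This is only a sketch, not the actual job: `col_summaries` and `time_runs` are hypothetical helpers, the matrix is random data, and each backend is timed several times so that a one-off warm-up cost does not dominate the picture.

```r
library(doParallel)

# Hypothetical workload: per-column summary statistics of a numeric matrix.
col_summaries <- function(m) {
  foreach(j = seq_len(ncol(m)), .combine = rbind) %dopar% {
    x <- m[, j]
    c(mean = mean(x), sd = sd(x), median = median(x))
  }
}

# Hypothetical helper: elapsed seconds for n repeated runs on the
# currently registered backend.
time_runs <- function(m, n = 3) {
  vapply(seq_len(n),
         function(i) system.time(col_summaries(m))[["elapsed"]],
         numeric(1))
}

set.seed(1)
m <- matrix(rnorm(2000 * 100), nrow = 2000)

# Multicore backend (Unix-alikes only).
registerDoParallel(2)
res_mc <- col_summaries(m)
t_mc <- time_runs(m)

# FORK cluster driven through the snow-style backend.
cl <- makeCluster(2, type = "FORK")
registerDoParallel(cl)
res_snow <- col_summaries(m)
t_snow <- time_runs(m)
stopCluster(cl)

# One row per backend, one column per repeated run.
print(rbind(multicore = t_mc, fork_cluster = t_snow))
```

Timings on a matrix this small mostly reflect overhead; increasing the dimensions toward the real data size, while watching memory in Activity Monitor or `top`, would give a fairer picture.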

Before I try to diagnose this further, I thought I'd ask: does anyone have an intuitive explanation for why this happens, and is it normal? I've searched everywhere for a resource that explains the differences between the two backends in detail, but haven't found one. Are there cases where, say, the multicore version would run and the snow version wouldn't (due to memory issues)?

(Note: This is all on a Mac with 16 GB of RAM. I did read these threads -- "Does multicore computing using R's doParallel package use more memory?" and "reading global variables using foreach in R" -- which almost address my concern but not quite.)

*I'm happy to provide my code, but it didn't seem relevant for this example. I'm hoping for a more general breakdown of the differences between these two approaches.

dd9000
  • I have a small section on this in my guide to parallelism with foreach: https://privefl.github.io/blog/a-guide-to-parallelism-in-r/#using-clusters. You should maybe provide your code, as this may be a problem that only occurs when using big matrix objects, and many people here might not be familiar with them. – F. Privé Jul 08 '19 at 05:35
  • Thank you so much for the response! I realize now that my question was unclear. My understanding is that I am "forking" in both cases (not using "PSOCK"). However, when I'm "FORK"ing with the snow backend, things run slowly; when I'm using the multicore backend, things run quickly. I was confused about why the gap would be so substantial, and if there were general rules of thumb for thinking about one approach versus another. (Edited to make this all more clear.) That said, I'll clean up my code and try to get it ready for sharing. – dd9000 Jul 08 '19 at 19:09
  • Hum, that's weird. Are you sure it's not a problem of caching (first run is slow and second run is fast)? Also, you might want to try the {future} API too. – F. Privé Jul 09 '19 at 06:25
