
I have a 31-CPU machine available for parallel computations. I would like to create a single 31-node cluster that would then serve several different R processes for their parallel computations. How can this be done?

I am currently using makeCluster like this:

library(doParallel)

cl <- makeCluster(5)
registerDoParallel(cl)

but this will only serve the current R process. How can I connect to a cluster created in a different R process?

PS: The reason why I want multiple processes to access one cluster is that I want to keep adding new sets of computations, which would wait in the queue until the running computations finish. I hope it will work this way? I have used doRedis for this in the past, but there were some problems, and I would now prefer a simple cluster for this purpose.
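For context, the doRedis pattern I used in the past looked roughly like this. This is only a minimal sketch: it assumes a Redis server is running on localhost, and the queue name "jobs" is my own choice, not anything prescribed by the package:

library(doRedis)

# Any R process that registers the same queue name can submit work to it;
# the queue lives in Redis, not in any single R session.
registerDoRedis("jobs")

# Launch background worker processes that service the "jobs" queue.
# These could equally be started from a different R session or machine.
startLocalWorkers(n = 4, queue = "jobs")

# Tasks wait in the Redis queue until a free worker picks them up.
results <- foreach(i = 1:100) %dopar% sqrt(i)

# Clean up the queue when done.
removeQueue("jobs")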

Tomas
  • You can't, because PSOCK clusters communicate over R connections, and those cannot be transferred to another R session. – HenrikB Mar 31 '20 at 17:26
  • @HenrikB thanks. You probably meant "you can't with `makeCluster()`", right? But I am also open to solutions using different parallel backends and different functions/packages... – Tomas Mar 31 '20 at 18:01
  • Yes, `parallel::makeCluster()` creates PSOCK clusters. There might be other types of workers that could achieve what you want, but I don't know of any. Redis is certainly something I would invest time looking into for this. It sounds like you're trying to implement some type of work/job queue. Maybe it's easier, and certainly more robust, to look into real job schedulers such as Slurm; the future.batchtools backend for futures supports that (see the sketch below). – HenrikB Mar 31 '20 at 19:36
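A minimal sketch of the future.batchtools route suggested above, assuming a working Slurm installation and a batchtools Slurm template file (the template path here is a placeholder):

library(future.batchtools)

# Each future is submitted as a Slurm job, so the scheduler (not a single
# R session) owns the queue and any number of R processes can keep
# adding work independently.
plan(batchtools_slurm, template = "slurm.tmpl")

f <- future({
  sum(rnorm(1e6))
})

value(f)  # blocks until the Slurm job finishes and returns its result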

0 Answers