
I'm attempting to run a parallel job in R using snow. I've been able to run extremely similar jobs with no trouble on older versions of R and snow. R package dependencies prevent me from reverting.

What happens: My jobs terminate at the parRapply step, i.e., the first time the nodes have to do anything beyond reporting Sys.info(). The error message reads:

Error in checkForRemoteErrors(val) : 
3 nodes produced errors; first error: cannot open the connection 
Calls: parRapply ... clusterApply -> staticClusterApply -> checkForRemoteErrors

Specs: R 2.14.0, snow 0.3-8, RedHat Enterprise Linux Client release 5.6. The snow package has been built on the correct version of R.

Details: The following code appears to execute fine:

cl <- makeCluster(3)
clusterEvalQ(cl, library(deSolve, lib.loc = "~/R/library"))
clusterCall(cl,function() Sys.info()[c("nodename","machine")])

I'm an end-user, not a system admin, but I'm desperate for suggestions and insights into what could be going wrong.

Sarah

1 Answer


This cryptic error appeared because an input file requested during program execution wasn't actually present. Each node would try to load the file and fail, but all that came back to the master was the generic "cannot open the connection" message.

In other words, almost any failure to open a file on a worker can surface as the same "connection" error, with no hint of which file was involved. Incredibly annoying!
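One way to make this kind of failure easier to spot is to check for the file explicitly on each node before reading it. The following is only a rough sketch: `safe_read` and `check_input_on_nodes` are hypothetical helper names, not part of snow, and it loads the `parallel` package that ships with R 2.14.0 and later (its `clusterCall` mirrors snow's) so that it runs without extra installs.

```r
library(parallel)  # bundled with R >= 2.14.0; snow provides clusterCall() as well

# Hypothetical helper: fail with a message that names the file, the node,
# and the working directory instead of "cannot open the connection".
safe_read <- function(path, ...) {
  if (!file.exists(path)) {
    stop(sprintf("input file '%s' not found on node %s (working directory: %s)",
                 path, Sys.info()[["nodename"]], getwd()))
  }
  read.table(path, ...)
}

# Hypothetical helper: ask every node whether it can see the file.
# Returns one list per node with its name, working directory, and result.
check_input_on_nodes <- function(cl, path) {
  clusterCall(cl, function(p) {
    list(node   = Sys.info()[["nodename"]],
         wd     = getwd(),
         exists = file.exists(p))
  }, path)
}

# Usage (assumes a cluster `cl` created with makeCluster(3) as in the question):
#   check_input_on_nodes(cl, "dataTable.csv")
```

Running the check before the parRapply step should show which nodes cannot see the file and what working directory each one is using, and using `safe_read` inside the worker function means the error reported back by checkForRemoteErrors names the missing file.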

Sarah
  • What sort of input file? Were you `source`ing something? – Roman Luštrik Dec 17 '11 at 20:56
  • All the R files were sourced properly. The program would try to load a .csv file that wasn't present (`data <- read.table("dataTable.csv")`) – Sarah Dec 17 '11 at 21:36
  • See https://stackoverflow.com/questions/16895848/results-of-workers-not-returned-properly-snow-debug for some help on debugging these problems. – mob Oct 17 '17 at 14:50