In a nutshell, I am trying to parallelise my whole script over dates using snow and adply, but I continually get the error below.

Error in unserialize(socklist[[n]]) : error reading from connection
In addition: Warning messages:
1: <anonymous>: ... may be used in an incorrect context: ‘.fun(piece, ...)’

2: <anonymous>: ... may be used in an incorrect context: ‘.fun(piece, ...)’

I have set up the parallelisation process in the following way:

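library(parallel)  # detectCores
library(doSNOW)    # registerDoSNOW; attaches snow (makeCluster, clusterExport)
library(plyr)      # adply

# Start one socket worker per logical core and register it as the backend
Cores <- detectCores(all.tests = FALSE, logical = TRUE)
cl <- makeCluster(Cores, type = "SOCK")
registerDoSNOW(cl)
clusterExport(cl, c("Var1", "Var2", "Var3", "Var4"), envir = environment())

# Apply MainCalcFunction to each row (date) of dateSeries in parallel
exposureDaily <- adply(.data = dateSeries, .margins = 1, .fun = MainCalcFunction,
                       .expand = TRUE, Var1, Var2, Var3,
                       Var4, .parallel = TRUE)

stopCluster(cl)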

Where dateSeries might look something like

> dateSeries
  marketDate
1 2016-04-22
2 2016-04-26

MainCalcFunction is a very long script that contains many of my own functions. Because the script is so long, reproducing it wouldn't be practical, and a hypothetical small function would defeat the purpose, as I have already got this methodology to work with other, smaller functions. I can say that within MainCalcFunction I load all my libraries, the necessary functions, and a file containing all other variables aside from those exported above, so that I don't have to export a long list of libraries and other objects.

MainCalcFunction runs successfully in its entirety over 2 dates using adply without parallelisation, which tells me that it is not a bug in the code that is causing the parallel run to fail.

Initially I thought (from experience) that the parallelisation over dates was failing because another function within the code also used parallelisation. However, I have since rebuilt the whole code to make sure that there is no such function.

I have pored over the script with a fine-tooth comb to see whether there is any place where I accidentally didn't export something I needed, and I can't find anything.

Some ideas as to what could be causing the code to fail are:

  • The use of various option valuation functions in fOptions and RQuantLib
  • The use of type = "SOCK"

I am aware of this question already asked and also this question, and while the first question has helped me, it hasn't yet helped solve the problem. (Note: that may be because I haven't used it correctly, having mainly used loginfo("text") to track where the code is. Potentially, there is a way to change that so that I log warning and/or error messages instead?)

Please let me know if there is any other information I can provide to help in solving this. I would be very grateful for any guidance: the code takes close to 40 minutes to run for a single day and I need to run it for close to a year, so parallelisation is essential!

EDIT

I have tried to implement the suggestion in the first question linked above by using the outfile option. Given that I am using Windows, I have done this by including the following lines before exporting the key objects and running MainCalcFunction:

# Log file written by the master process
reportLogName <- "logout_parallel.txt"
addHandler(writeToFile,
           file = paste(Save_directory, reportLogName, sep = ""),
           level = 'DEBUG')
with(getLogger(), names(handlers))

loginfo(paste("Starting log file", getwd()))

mc <- detectCores()
cl <- makeCluster(mc, outfile = "")
registerDoParallel(cl)

Similarly, at the beginning of MainCalcFunction, after having sourced my libraries and functions I have included the following to print to file:

# One log file per date, written from inside the worker
reportLogName <- paste(testDate, "_logout.txt", sep = "")
addHandler(writeToFile,
           file = paste(Save_directory, reportLogName, sep = ""),
           level = 'DEBUG')
with(getLogger(), names(handlers))

loginfo(paste("Starting test function ", getwd(), sep = ""))

In the MainCalcFunction function I have then put loginfo("text") statements at key junctures to inform me of where the code is at.

As a result, some text files are available after the code fails with the aforementioned error. However, these files provide no information on the cause of the error beyond the point at which it occurs. This is despite MainCalcFunction being wrapped in a tryCatch statement whose error handler calls logerror(e).
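For the question above about logging warnings and errors rather than only progress messages, here is a minimal sketch in base R. The names log_path, log_line and safe_calc are hypothetical and not part of the original script, and the sketch writes to a plain text file rather than using the logging package. Note that handlers like these only see R-level conditions; they cannot record anything if the worker process itself dies or loses its connection to the master.

```r
# Assumed log location; in the real script this would be a per-date file
log_path <- file.path(tempdir(), "worker_log.txt")

# Hypothetical helper: append one timestamped line to the log file
log_line <- function(level, msg) {
  cat(format(Sys.time()), level, msg, "\n", file = log_path, append = TRUE)
}

# Hypothetical wrapper: run f(...), logging every warning and any error
safe_calc <- function(f, ...) {
  withCallingHandlers(
    tryCatch(f(...), error = function(e) {
      log_line("ERROR", conditionMessage(e))
      NULL                               # return NULL so the caller can carry on
    }),
    warning = function(w) {
      log_line("WARNING", conditionMessage(w))
      invokeRestart("muffleWarning")     # continue after logging the warning
    }
  )
}

res_ok  <- safe_calc(function(x) { warning("check input"); x * 2 }, 21)
res_err <- safe_calc(function() stop("boom"))
```

After this runs, the file at log_path holds one WARNING line and one ERROR line, while res_ok carries the normal result and res_err is NULL.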

Celeste
2 Answers

I am posting this answer in case it helps anyone else with a similar problem in the future.

Essentially, the error unserialize(socklist[[n]]) doesn't tell you a lot, so solving it is a matter of narrowing down the issue.

  • First, be absolutely sure the code runs over several dates in serial (non-parallel) with no errors.
  • Ensure the parallelisation is set up correctly. There are some obvious initial errors that many other questions address, e.g. hidden parallelisation inside the code, which means parallelisation is occurring twice.
  • Once you are sure that there is no problem with the code and the parallelisation is set up correctly, start narrowing down. The issue is likely (unless something has been missed above) something in the code that isn't a problem when it is run in serial but becomes a problem when run in parallel. The easiest way to narrow it down is to set outfile = "Log.txt" in whichever makeCluster function you use, e.g. cl <- makeCluster(cores - 1, outfile = "Log.txt"). Then add as many print("Point in code") statements to your function as you need to pin down where the issue occurs.
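The outfile approach in the last step can be sketched as follows. This is only an illustration using the parallel package: slow_calc is a hypothetical stand-in for a long function such as MainCalcFunction, and the log location and worker count are assumptions.

```r
library(parallel)

# Hypothetical stand-in for the real long-running function
slow_calc <- function(d) {
  print(paste("Start", d))           # progress marker, lands in the worker log
  out <- nchar(as.character(d))      # placeholder for the real work
  print(paste("End", d))
  out
}

log_file <- file.path(tempdir(), "Log.txt")   # assumed log location
cl <- makeCluster(2, outfile = log_file)      # worker output is redirected here
res <- parLapply(cl, c("2016-04-22", "2016-04-26"), slow_calc)
stopCluster(cl)

# Inspect the markers afterwards, e.g. with readLines(log_file)
```

When a run fails, the last marker each worker printed to the log shows roughly where in the function it stopped, and more print statements can then be added around that point.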

In my case, the problem was the line jj = closeAllConnections(). This line works fine in non-parallel but breaks the code when in parallel. I suspect it has something to do with the function closing all connections including socket connections that are required for the parallelisation.
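If that suspicion is right, one way around it is to close only the connections the function itself opened instead of calling closeAllConnections(). A minimal sketch, where write_result and tmp are hypothetical names used purely for illustration:

```r
# Hypothetical helper: write text to a file and close only its own connection
write_result <- function(text, path) {
  con <- file(path, open = "w")
  on.exit(close(con), add = TRUE)   # closes just `con`, nothing else
  writeLines(text, con)
  invisible(path)
}

tmp <- tempfile(fileext = ".txt")
write_result("some output", tmp)
```

Because close(con) only touches the one connection it is given, any other open connections in the worker process are left alone.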

Celeste

Try running using plain R instead of running in RStudio.