I am running into a problem with the foreach section of a program I am working with in R. The program is used to run simulations for varying parameters, and then return the results to a single list which is then used to generate a report.
The problem is that not all of the assigned simulation runs are visible in the report. By all appearances, only a subset of the assigned runs was actually carried out.
This is more likely to take place with larger data sets (longer time periods for a simulation, for example).
It is less likely to occur with a fresh run of the program, and more likely to occur if something is already taking up RAM.
The memory use graph for system monitor sometimes peaks at 100% RAM and 100% swap, and then dips sharply, after which time one of the four child R sessions has disappeared.
When using .verbose in foreach(), the log file shows that the simulation runs that do not get shown in the report are returned as NULL, while those which do get shown in the report are returned as normal (a list of data frames and character variables).
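A possible reason the NULL runs leave no trace in the combined output: base R's cbind() ignores NULL arguments, so with .combine set to cbind a NULL result would simply contribute no column. The following is a minimal sketch of that behaviour in plain R, without foreach; the `results` list is made-up stand-in data, not output from the real program.

```r
# Made-up stand-in for three run results, the second of which came back NULL.
results <- list(
  list(parameters = list(1, "a"), runResult = "ok"),
  NULL,
  list(parameters = list(2, "b"), runResult = "ok")
)

# Combine the results the way .combine = cbind would, column by column.
combined <- do.call(cbind, results)

# cbind() skips the NULL entry, so the combined object has one column
# fewer than the number of runs and nothing marks the gap.
n_runs <- length(results)
n_columns <- ncol(combined)
print(c(runs = n_runs, columns = n_columns))
```

If this is what is happening, comparing the number of columns in the combined result against the number of parameter combinations would be a cheap way to notice that something went missing.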
The same set of parameters can produce this effect or can produce a complete graph; that is, the set of parameters is not diagnostic.
foreach() is used for approximately a dozen parameters. .combine is cbind, .inorder is FALSE, and all other arguments, such as .errorhandling, are left at their defaults.
This is of course quite irritating, since the simulations can take upwards of twenty minutes to run only to turn out to be useless due to missing data. Is there a way to either ensure that these "dropped" sessions are not dropped, or that if they are then this is in some way caught?
(If it's important, the computer being used has eight processors and hence runs four child processes, and the parallel backend registered is from the doMC package.)
The code is structured roughly as follows:
# Nested foreach loops: one loop per parameter, chained with %:%.
test.results <- foreach(parameter.one = parameter.one.space, .combine = cbind) %:%
  foreach(parameter.two = parameter.two.space, .combine = cbind) %:%
  ...
  foreach(parameter.last = parameter.last.space, .combine = cbind, .inorder = FALSE) %dopar%
  {
    run.result <- simulationRun(parameter.one,
                                parameter.two,
                                ...
                                parameter.last)
    # Return the parameters together with the result of this run.
    list(list(parameters = list(parameter.one,
                                parameter.two,
                                ...
                                parameter.last),
              runResult = run.result))
  }
return(test.results)
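One way the dropped runs might at least be caught is to keep the results in a plain list (one entry per parameter combination) rather than combining them with cbind, check that list for NULL entries, and rerun only those. Below is a rough sketch of that idea in base R. Everything in it is an assumption for illustration: `simulation_run` is a hypothetical stand-in for the real simulationRun(), the two-parameter grid is invented, and the lapply() call merely imitates a parallel step in which runs 2 and 5 came back NULL.

```r
# Hypothetical stand-in for the real simulation function.
simulation_run <- function(p1, p2) p1 * p2

# Every parameter combination, one row per run (invented example grid).
grid <- expand.grid(p1 = 1:2, p2 = 1:3)

# Run one row of the grid and return parameters plus result.
run_one <- function(i) {
  list(parameters = grid[i, ],
       runResult = simulation_run(grid$p1[i], grid$p2[i]))
}

# Imitation of the parallel step: runs 2 and 5 are returned as NULL.
results <- lapply(seq_len(nrow(grid)), function(i) {
  if (i %in% c(2, 5)) return(NULL)
  run_one(i)
})

# Detect which runs were dropped.
dropped <- which(vapply(results, is.null, logical(1)))
message(length(dropped), " of ", length(results), " runs returned NULL")

# Rerun only the dropped runs, sequentially, and fill the gaps.
for (i in dropped) {
  results[[i]] <- run_one(i)
}

# Confirm that nothing is missing any more.
still_missing <- sum(vapply(results, is.null, logical(1)))
```

Rerunning sequentially is slower, but it avoids several child processes competing for memory at the same moment, which seems to be when the runs disappear.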