4

I've been trying to get a parallelized foreach loop running in R. It works fine for approximately ten iterations but then crashes with the error:

Error in { : task 7 failed - "missing value where TRUE/FALSE needed"
Calls: %dopar% -> <Anonymous>
Execution halted

I append the results of each iteration to a file, and that output looks as expected. My script is as follows, using the combn_sub function from this post:

library(data.table)  # provides fread()

LBRA <- fread(
 input      = "LBRA.012",
 data.table = FALSE)
str_bra <- nrow(LBRA)

br1sums <- colSums(LBRA)
b1non <- which(br1sums == 0)

LBRA_trim <- LBRA[,-b1non]

library(foreach)
library(doMC)
registerDoMC(28)

foreach(X = seq(2, (nrow(LBRA)-1))) %dopar% {
  com <- combn_sub(
   x    = nrow(LBRA),
   m    = X,
   nset = 1000)

  out_in <- matrix(
   ncol = 2,
   nrow = 1)
  colnames(out_in) <- c("SNPs", "k")

    for (A in seq(1, ncol(com))){
      rowselect <- com[, A]

      sub <- LBRA_trim[rowselect, ]
      subsum <- colSums(sub)

      length <- length(which(subsum != 0)) - 1
      out_in <- rbind(out_in, c(length, X))
    }

  write.table(
   file   = "plateau.csv",
   sep    = "\t",
   x      = out_in,
   append = TRUE)
}
JohnnyTooBad
  • 1
    Have you tried running the first 10 iterations without using `%dopar%` to see which line it's failing on? My guess is that there's something in that line of data that's different from the other lines of data that isn't being accounted for. – tblznbits Feb 08 '16 at 16:20
  • 2
    Writing to a file from inside a parallel loop seems ill advised to me. – Roland Feb 08 '16 at 16:27
  • 1
    Explicitly create your cluster instead of calling `registerDoMC(28)` and specify a log file. That way you can see what is going on: `library(doParallel); cluster <- makeCluster(28, outfile="MulticoreLogging.txt"); registerDoParallel(cluster);` – Mekki MacAulay Feb 08 '16 at 16:31

3 Answers

3

I had a similar problem with my foreach call...

tmpcol <- foreach(j = idxs:idxe, .combine=cbind) %dopar% { imp(j) }

Error in { : task 18 failed - "missing value where TRUE/FALSE needed"

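Changing the .errorhandling parameter does not fix the problem. With "pass", the error object of the failed task is simply handed to .combine, which then produces a warning: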

tmpcol <- foreach(j = idxs:idxe, .combine=cbind, .errorhandling="pass") %dopar% { imp(j) }

Warning message:
In fun(accum, result.18) :
  number of rows of result is not a multiple of vector length (arg 2)
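To see what "pass" actually returns, it can help to drop .combine and inspect the raw result list. The following is a minimal sketch, not the original code: imp_demo is a hypothetical stand-in for imp(j), and %do% is used so the example runs without a registered backend (the same .errorhandling argument applies to %dopar%).

```r
library(foreach)

# Hypothetical stand-in for a per-task function; it fails for j = 3
# because the comparison yields NA and `if` cannot branch on NA.
imp_demo <- function(j) {
  value <- if (j == 3) NA else j
  if (value > 0) value * 2 else 0
}

# No .combine, so the result is a plain list. With .errorhandling = "pass",
# a failed task contributes its error object instead of aborting the loop.
results <- foreach(j = 1:5, .errorhandling = "pass") %do% imp_demo(j)

# Find the tasks whose result is an error object and read their messages
failed <- which(vapply(results, inherits, logical(1), what = "error"))
failed_messages <- vapply(results[failed], conditionMessage, character(1))
```

Here failed holds the positions of the failing tasks, which can then be re-run one at a time outside the loop.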

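I suggest running the body of your foreach call on its own for the failing task. Note that task 7 is the seventh iteration, which is X = 8 here because your sequence starts at 2. The problem in my case was that my function, imp(j), was throwing an error (for j = 18 it was tripping over an NA calculation), which resulted in the vague output from foreach.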
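For illustration only, this is how an NA typically ends up in a condition and how it can be guarded against. The vector subsum below is toy data, and nothing here is taken from the original imp() function, whose NA calculation is not shown in this answer.

```r
subsum <- c(4, 0, NA, 2)  # toy column sums; NA stands for a missing value

# Unguarded: comparing NA yields NA, and `if (NA)` raises the error
unguarded <- tryCatch(
  if (subsum[3] != 0) "variable" else "fixed",
  error = function(e) conditionMessage(e)
)

# Guarded: isTRUE() maps NA to FALSE, so the branch is always well defined
guarded <- if (isTRUE(subsum[3] != 0)) "variable" else "fixed"

# Vectorised counting: na.rm = TRUE skips missing entries explicitly
n_nonzero <- sum(subsum != 0, na.rm = TRUE)
```

Whether treating NA as FALSE is appropriate depends on the data. In many cases it is better to find out why the NA appears in the first place.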

Nathan Dyjack
1

As @Roland points out, it's a very bad idea to write to a file within a foreach loop. Even in append mode, the individual workers will attempt to write to the file simultaneously and may clobber each other's output. Instead, capture the results of the foreach statement using the .combine="rbind" option and then write to the file once, after the loop:

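library(doParallel)  # provides registerDoParallel(); also loads foreach and parallel

# outfile redirects the workers' console output to a log file
cluster <- makeCluster(28, outfile = "MulticoreLogging.txt")
registerDoParallel(cluster)

foreach_outcome_table <- foreach(X = seq(2, nrow(LBRA) - 1), .combine = "rbind") %dopar% {

  # Log which worker process handles which iteration
  cat(paste(Sys.info()[["nodename"]], Sys.getpid(), sep = "-"), "now performing loop", X, "\n")

  com <- combn_sub(x = nrow(LBRA), m = X, nset = 1000)

  # Start with zero rows so that no all-NA row ends up in the result
  out_in <- matrix(ncol = 2, nrow = 0)
  colnames(out_in) <- c("SNPs", "k")

  for (A in seq_len(ncol(com))) {
    rowselect <- com[, A]

    sub <- LBRA_trim[rowselect, ]
    subsum <- colSums(sub)

    # Renamed from `length` to avoid masking the base function
    n_snps <- length(which(subsum != 0)) - 1
    out_in <- rbind(out_in, c(n_snps, X))
  }
  out_in  # value of the block, collected by .combine
}
stopCluster(cluster)

# Single write after all workers have finished
write.table(file = "plateau.csv", sep = "\t", x = foreach_outcome_table, append = TRUE)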

Further, you could replace the inner for loop with a nested foreach loop which would probably be more efficient.
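As a related alternative (this swaps in vapply rather than a nested foreach, so it is not the approach suggested above), the inner loop can be written without growing a matrix via rbind. This is a rough sketch on toy data: geno, com and X are made-up stand-ins for LBRA_trim, the combn_sub output and the loop variable, and the "- 1" simply mirrors the original code.

```r
set.seed(1)

# Toy stand-ins (assumptions): 6 samples x 5 SNPs, and 4 subsets of 3 rows
geno <- matrix(rbinom(6 * 5, size = 2, prob = 0.3), nrow = 6)
com  <- combn(nrow(geno), 3)[, 1:4]
X    <- 3

# One count per column of `com`, computed without repeated rbind calls
snp_counts <- vapply(seq_len(ncol(com)), function(A) {
  sub <- geno[com[, A], , drop = FALSE]
  sum(colSums(sub) != 0) - 1   # "- 1" kept from the original code
}, numeric(1))

# Assemble the two-column result in one step
out_in <- cbind(SNPs = snp_counts, k = X)
```

The result has the same two columns as before, so it can be returned from the foreach block and combined with .combine="rbind" unchanged.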

Mekki MacAulay
  • Thanks for the advice. My initial script didn't write to a file within the loop, but I added it to see at which point the crash occurred. – JohnnyTooBad Feb 09 '16 at 10:47
  • That makes sense. The multicore logging when creating the cluster is a better option to reduce collisions. You can even have the individual workers report their ID so you can figure out which one is doing what. I write status updates from within the foreach loop using `cat(paste(Sys.info()[['nodename']], Sys.getpid(), sep='-'), "now performing action X\n");` or something similar to keep track. – Mekki MacAulay Feb 09 '16 at 14:27
0

There could be many reasons for the error, "missing value where TRUE/FALSE needed".

What helped me was to remove the %dopar% and run the same code on a single item. This revealed more and clearer error messages which, I think, get lost when running in parallel. My error had nothing to do with %dopar% itself.
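One way to do this, sketched here with a hypothetical loop_body function and toy data rather than the code from the question, is to wrap the loop body in a function and call it directly for one suspicious item. Keeping the condition object gives access to both the message and the expression that failed.

```r
# Hypothetical loop body, wrapped in a function so it can be called directly
loop_body <- function(X) {
  counts <- c(5, 3, NA, 8)            # toy data; the third value is missing
  if (counts[X] > 4) "many" else "few"
}

# Call the body for a single item and keep the full condition object
err <- tryCatch(loop_body(3), error = function(e) e)

# The message matches what foreach reports; the call shows where it happened
failed_message <- conditionMessage(err)
failed_call <- paste(deparse(conditionCall(err)), collapse = " ")
```

In an interactive session, calling loop_body(3) without tryCatch and then running traceback() gives similar information.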