5

My question is related to this question. However the question referenced above uses multicore package which was replaced by parallel. Most of the functions in the response cannot be replicated in the parallel package. Is there a way to track progress in mclapply. In looking at the mclapply documentation, there is a parameter called mc.silent, I'm not sure if this would be able to track progress, and if so how and where we can see the log file? I'm running on ubuntu linux OS. Please see below for a reproducible example for which I would like to tack progress.

require(parallel) 

wait.then.square <- function(xx){
  # Wait for one second
  Sys.sleep(2);
  # Square the argument 
  xx^2 } 

output <- mclapply( 1:10, wait.then.square, mc.cores=4,mc.silent=FALSE)

Any help would be greatly appreciated.

Community
  • 1
  • 1
forecaster
  • 1,084
  • 1
  • 14
  • 35

2 Answers2

10

Thanks to the package pbmcapply you can now easily track progress of mclapply and mcmapply jobs. Just replace mclapply by pbmclapply:

wait.then.square <- function(xx) {
    Sys.sleep(2)
    xx^2 
} 

library(pbmcapply)
output <- pbmclapply(1:10, wait.then.square, mc.cores = 4)

...which will display a pretty progress bar.

The author has a nice blog post on the technical details and performance benchmarks here.

thie1e
  • 3,588
  • 22
  • 22
  • Thank you very much. Do we know if pbmclapply work in windows? – forecaster Nov 22 '16 at 00:38
  • mclapply generally does not work on Windows. On Windows the pbapply package is an option for tracking progress without parallelization. I don't think it supports parallel apply functions. – thie1e Nov 22 '16 at 01:00
4

This is an update of my related answer.

library(parallel)

finalResult <- local({
  f <- fifo(tempfile(), open="w+b", blocking=T)
  if (inherits(parallel:::mcfork(), "masterProcess")) {
    # Child
    progress <- 0.0
    while (progress < 1 && !isIncomplete(f)) {
      msg <- readBin(f, "double")
      progress <- progress + as.numeric(msg)
      cat(sprintf("Progress: %.2f%%\n", progress * 100))
    } 
    parallel:::mcexit()
  }
  numJobs <- 100
  result <- mclapply(1:numJobs, function(...) {
    # Do something fancy here... For this example, just sleep
    Sys.sleep(0.05)
    # Send progress update
    writeBin(1/numJobs, f)
    # Some arbitrary result
    sample(1000, 1)
  })
  close(f)
  result
})

cat("Done\n")
Community
  • 1
  • 1
fotNelton
  • 3,844
  • 2
  • 24
  • 35