I noticed some problems with parallel processing in my environment: I get unexplained NULL values in my result lists. Below is a simple example that produces NULLs.

library(parallel)

print(sessionInfo())
print(paste("Number of cores:", detectCores()))

list_a <- list()

# Assign the values 1 to 100 to the list
for (i in 1:100) {
  list_a[[i]] <- i
}

res <- mclapply(list_a, function(x) {x*x}, mc.cores = 28)

# Print length of list_a, unlist(res) and number of nulls in res
print(paste("Length of the list_a is", length(list_a)))
print(paste("Length of the unlist(res) is", length(unlist(res))))
print(paste("Number of nulls in res is", 
            sum(unlist(lapply(res, is.null)))))

Below is the output when I run the script with Rscript:

pietvil@90113001SR001  $ Rscript --no-init-file mclapplydebug3.R
R version 3.5.1 (2018-07-02)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.10 (Santiago)

Matrix products: default
BLAS: /usr/lib64/R/lib/libRblas.so
LAPACK: /usr/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

loaded via a namespace (and not attached):
[1] compiler_3.5.1
[1] "Number of cores: 56"    
[1] "Length of the list_a is 100"
[1] "Length of the unlist(res) is 97"
[1] "Number of nulls in res is 3"

If I run mclapply with 29 cores, I also get the following error:

Error in sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE)) :
      write error, closing pipe to the master
    Calls: mclapply -> lapply -> FUN -> sendMaster

If I use fewer than 28 cores, I do not get any NULL values. However, I am running a very time-consuming script and would like to use as many cores as possible. Any idea what to do?

Edit1: I noticed this problem after a major OS update on our servers. I have a complicated mclapply call and a foreach %dopar% loop, both of which started to return unexpected NULL values. While investigating, I noticed that even this simple example returned NULL values, which is why I posted it. In my environment, even foreach with this example sometimes returns NULL values.

Edit2: I tried this example on another server (RHEL 7.6), and I do not get any NULL values in that environment.

Edit3: If I run res <- mclapply(list_a, function(x) {x*x}, mc.cores = 28) again later in the script, it sometimes produces NULLs, but not always.
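
Since the NULLs appear intermittently, one possible stopgap is to detect which elements came back as NULL and recompute only those serially. The following is a minimal sketch of that idea and does not address the root cause. fill_nulls is a hypothetical helper name (not part of the parallel package), it assumes the worker function never legitimately returns NULL, and mc.cores = 2 is used only so the example runs on small machines (values above 1 are not supported on Windows).

```r
library(parallel)

# Hypothetical helper: recompute NULL entries of a result list serially.
# Assumption: FUN never legitimately returns NULL, so a NULL entry
# means the corresponding worker result was lost.
fill_nulls <- function(res, X, FUN) {
  missing_idx <- which(vapply(res, is.null, logical(1)))
  if (length(missing_idx) > 0) {
    message("Recomputing ", length(missing_idx), " NULL result(s) serially")
    res[missing_idx] <- lapply(X[missing_idx], FUN)
  }
  res
}

list_a <- as.list(1:100)
square <- function(x) x * x

res <- mclapply(list_a, square, mc.cores = 2)
res <- fill_nulls(res, list_a, square)

print(paste("Number of nulls in res is",
            sum(vapply(res, is.null, logical(1)))))
```

The message also records how many results were lost in each run, which may help when comparing core counts or servers.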
