
I am new to R programming, as you can tell from the nature of my question. I am trying to take advantage of the parallel computing ability of the train function.

library(parallel)
#detects number of cores available to use for parallel package
nCores <- detectCores(logical = FALSE)
cat(nCores, " cores detected.")  

# detect threads with parallel()
nThreads<- detectCores(logical = TRUE)
cat(nThreads, " threads detected.")

# Create doSNOW compute cluster (try 64)
# One can increase up to 128 nodes
# Each node requires 44 Mbyte RAM under WINDOWS.
cluster <- makeCluster(128, type = "SOCK")
class(cluster);

I need someone to help me interpret this code. Originally the first argument of makeCluster() was nThreads, but after running

nCores <- detectCores(logical = FALSE)

I learned that I have 4 cores available. I changed the value based on the message provided in the guide. Will this enable me to run 128 iterations of the train function simultaneously? If so, what is the point of getting the number of threads and cores that my computer has in the first place?

zx8754
igbobahaushe

2 Answers


The first thing you want to do is detect the number of cores you have.

nCores <- detectCores() - 1

Most of the time people subtract 1 so that one core is left free for other work.

cluster <- makeCluster(nCores)

This sets up the cluster of workers your code will run on. There are several parallel methods (doParallel, parApply, parLapply, foreach, ...). Whichever method you choose, it will run on the specific cluster you've created.

A small example I used in my own code:

  no_cores <- detectCores() - 1
  cluster <- makeCluster(no_cores)
  result <- parLapply(cluster, docs$text, preProcessChunk)
  stopCluster(cluster)
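Since the question is about caret's train(), note that train() does not take a cluster directly; it uses whatever foreach backend is registered. A minimal sketch of that registration, assuming the doParallel and caret packages are available:

```r
library(parallel)
library(doParallel)

no_cores <- detectCores() - 1
cluster <- makeCluster(no_cores)
registerDoParallel(cluster)  # foreach-based code, including caret::train(),
                             # now runs on these workers

# ... call caret::train(...) here: its resampling iterations are spread
# across no_cores workers, not 128 at once ...

stopCluster(cluster)
registerDoSEQ()  # fall back to sequential foreach once the cluster is gone
```

With the backend registered, train() parallelizes automatically; no extra arguments are needed.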

I also see that you're making use of SOCK. I'm not sure type = "SOCK" works here; I always use type = "PSOCK". FORK also exists, but whether you can use it depends on which OS you're running.

FORK: "to divide in branches and go separate ways"
Systems: Unix/Mac (not Windows)
Environment: Link all

PSOCK: Parallel Socket Cluster
Systems: All (including Windows)
Environment: Empty
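The "Environment: Empty" point matters in practice: PSOCK workers start as fresh R sessions, so objects from the master must be exported explicitly, whereas FORK workers (Unix/Mac only) inherit everything. A small sketch illustrating this:

```r
library(parallel)

x <- 42  # defined in the master session

# PSOCK workers start with an empty environment...
cl <- makeCluster(2, type = "PSOCK")
res_before <- unlist(clusterEvalQ(cl, exists("x")))  # FALSE on each worker

clusterExport(cl, "x")  # ...so objects must be shipped over explicitly
res_after <- unlist(clusterEvalQ(cl, exists("x")))   # TRUE on each worker
stopCluster(cl)

# On Unix/Mac, makeCluster(2, type = "FORK") workers would see x
# automatically, because forked processes inherit the master's environment.
```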
D.Dsn

I am not entirely convinced that the spec argument of parallel::makeCluster is strictly a cap on the number of cores (actually, logical processors) used. I've passed detectCores() - 1 and detectCores() - 2 as spec for some computationally expensive processes, and the CPU usage and number of cores engaged still equalled detectCores(), despite my specifying that a little room be left (here, one logical processor free for other processes).

The example below is crude, as I've not captured any quantitative output of core usage. Please suggest edits.

You can visualize core usage by monitoring, e.g., Task Manager while running a simple example:

no_cores <- 5
cl <- makeCluster(no_cores)  # , outfile = "debug.txt"
parallel::clusterEvalQ(cl, {
  library(foreach)

  foreach(i = 1:1e5) %do% {
    print(sqrt(i))
  }
})
stopCluster(cl)
# browseURL("debug.txt")

Then, rerun using, e.g., detectCores() - 1:

no_cores <- parallel::detectCores() - 1

cl <- makeCluster(no_cores)  # , outfile = "debug.txt"
parallel::clusterEvalQ(cl, {
  library(foreach)

  foreach(i = 1:1e5) %do% {
    print(sqrt(i))
  }
})
stopCluster(cl)

All 16 cores appear to engage despite no_cores being specified as 15. [Screenshot: all cores engaging in the parallel process despite the spec argument being the number of cores minus 1.]

Based on the above example and my very crude (visual-only) analysis, it looks as though the spec argument may set the maximum number of worker processes used throughout the run, but the work does not appear to be pinned to that many cores. Being a novice parallelizer, perhaps a more rigorous example is needed to reject or support this?

The package documentation suggests spec is "A specification appropriate to the type of cluster."

I've dug into the relevant parallel documentation and cannot determine what, exactly, spec is doing. But I am not convinced the argument necessarily controls the maximum number of cores (logical processors) to engage.

Here is where I think I could be wrong in my assumptions: if we specify spec as less than the number of the machine's cores (logical processors), then, assuming no other large processes are running, the machine should never exceed no_cores times 100% CPU usage (e.g., 1500% of the 1600% available on a 16-core machine).

However, when I monitor the CPUs on a Windows OS using Resource Monitor, it does appear that there are, in fact, no_cores instances of Rscript.exe running.
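One way to make the check quantitative rather than visual is to ask each worker for its process ID: the number of distinct PIDs equals the number of worker processes that spec launched, even though the OS is free to schedule those processes across every core. A sketch, assuming at least 3 logical processors:

```r
library(parallel)

no_cores <- 3
cl <- makeCluster(no_cores)

# One task per worker; each reports the PID of the R process it runs in
pids <- parSapply(cl, seq_len(no_cores), function(i) Sys.getpid())
stopCluster(cl)

# spec controls how many worker processes exist...
print(length(unique(pids)))
# ...but the OS scheduler decides which physical cores they occupy,
# which is why Task Manager can show activity on every core.
```

This is consistent with seeing no_cores Rscript.exe images in Resource Monitor while all 16 cores show activity.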

Jessica Burnett