
I am trying to use multiple cores to run my R script on a big dataset. I have tried packages like parallel, doMC, and caret. In most examples, the multi-core operation is based on loops. However, there is no loop in my script; is it still possible to run it on multiple cores?

This is what my original script looks like. It works fine on a single core.

inFile = "filename" # A very BIG file

myFunction <- function(File){
  ...
  graph1 <- igraph(data)   # a few time-consuming functions, very SLOW
  spinglass(graph1)
  ...
}

myFunction(inFile)

This is how I try to use multiple cores:

library(parallel)
library(doMC)
library(caret)

inFile = "filename"

myFunction <- function(File){
  ...
  igraph(data)   # same slow functions as above
  ...
}

mclapply(inFile, myFunction, mc.cores = 4)

In mclapply(X, FUN, ...), X is supposed to be a list (or vector). In my script, I pass the file name inFile as the only element of that list. However, it still only uses one core.
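If I understand the documentation correctly, mclapply() forks one worker per element of X, so a one-element list can never occupy more than one core, no matter what mc.cores is set to. A minimal sketch of that understanding; the file names file1 ... file4 are hypothetical:

library(parallel)

# One element in X means one task, so only one core is ever used:
mclapply(list(inFile), myFunction, mc.cores = 4)

# Parallelism only appears when X has several independent elements,
# e.g. several (hypothetical) input files, one worker per file:
inFiles <- c("file1", "file2", "file3", "file4")
results <- mclapply(inFiles, myFunction, mc.cores = 4)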

PS: I run the script in a terminal rather than in a GUI.

FewKey
  • Can you split the input data into several chunks and have them processed on different cores? Also, it would be nice to have some dummy data/functions to test with. – SeGa Jul 19 '18 at 10:02
  • @SeGa Thank you for your reply. Unfortunately, the input file is a large correlation matrix, and it is not possible to split it. Do you mean that multi-core processing is only applicable when the task can be split into several subtasks, and that one function call can only run on one core at a time? – FewKey Jul 19 '18 at 10:44
  • There is a difference between multicore processing and multithreading; check out this [answer](https://stackoverflow.com/a/11835474/3682794). If the task cannot be split up into several independent tasks, then (in my opinion) you cannot use parallelization. – SeGa Jul 19 '18 at 11:28
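Following up on SeGa's last comment: if the analysis can be decomposed into independent runs rather than splitting the matrix itself, those runs can be farmed out to separate cores. A minimal sketch, assuming (purely for illustration) that several cluster_spinglass() runs with different random seeds are useful for the analysis; the built-in Zachary karate club graph stands in for the real data:

library(parallel)
library(igraph)

g <- make_graph("Zachary")   # small connected toy graph, stand-in for the real one

# Each run is independent, so mclapply() can fork one worker per seed;
# forked workers inherit g, so the graph is not copied by hand.
runs <- mclapply(1:4, function(seed) {
  set.seed(seed)
  cluster_spinglass(g)
}, mc.cores = 4)

Note that mclapply() relies on fork(), so it runs in parallel on Linux/macOS but not on Windows, and the parallel documentation discourages forking from GUI front ends, which is why running from a terminal (as in the PS) is the safer choice.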

0 Answers