
I am running a sound analysis function on around 25k short audio files. My code works but will take a very long time to run. What would be a good approach to parallelize it?

Thanks a lot,

files <- list.files(getwd(), pattern = ".mp3", all.files = FALSE, full.names = FALSE)

out <- NULL

for (i in files) {
  res <- try(soundgen::analyze(i, pitchMethods = 'dom', plot = FALSE, summary = TRUE), silent = TRUE)
  res["1", "duration"] <- i[[1]]
  out <- rbind(out, res)
  print(i)
}
user1029296
  • I (perhaps wrongly) edited your title, mainly because most base functions are vectorized already. Take a look at this: https://stackoverflow.com/questions/5571774/what-is-the-easiest-way-to-parallelize-a-vectorized-function-in-r – NelsonGon Jul 02 '19 at 20:08
  • Using `rbind` within a loop is very time consuming. It is best to preallocate the space and then assign the resulting values. Another option is the `parLapply` function from the parallel package (a sketch of both ideas follows these comments). – Dave2e Jul 02 '19 at 20:08
  • Possible duplicate of [run a for loop in parallel in R](https://stackoverflow.com/questions/38318139/run-a-for-loop-in-parallel-in-r) – divibisan Jul 02 '19 at 20:29
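
As a minimal, non-parallel sketch of Dave2e's suggestion: collect each file's result in a preallocated list and bind everything once at the end, instead of growing `out` with `rbind` on every iteration. The `analyze()` arguments are copied from the question; the variable names are just illustrative.

library(soundgen)

files <- list.files(getwd(), pattern = ".mp3", full.names = FALSE)

# preallocate one slot per file, fill it, and bind once at the end
res_list <- vector("list", length(files))

for (k in seq_along(files)) {
  res <- try(soundgen::analyze(files[k], pitchMethods = 'dom', plot = FALSE, summary = TRUE),
             silent = TRUE)
  if (!inherits(res, "try-error")) {
    res_list[[k]] <- res
  }
}

out <- do.call(rbind, res_list)   # a single rbind instead of ~25k incremental ones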

1 Answer


You can use the parallel package to achieve this easily:

library(parallel)
library(soundgen)

files <- list.files(getwd(), pattern = ".mp3", all.files = FALSE, full.names = FALSE)

soundAnalysis <- function(file) {
  res <- try(soundgen::analyze(file, pitchMethods = 'dom', plot = FALSE, summary = TRUE),
             silent = TRUE)
  if (inherits(res, "try-error")) return(NULL)  # skip files that fail to analyze
  res$file <- file                              # record which file each row came from
  res
}

# run the analysis across all available cores; each call returns its own result
output <- mclapply(X = files, FUN = soundAnalysis, mc.cores = detectCores())

# bind the per-file results into one data frame at the end
out <- do.call(rbind, output)
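
Note that `mclapply` parallelizes by forking, which is not available on Windows (there `mc.cores` is effectively limited to 1). If you are on Windows, a socket cluster with `parLapply` should work instead; a rough sketch, assuming the `soundAnalysis` function defined above:

library(parallel)

cl <- makeCluster(detectCores() - 1)   # leave one core free for the OS
clusterExport(cl, "soundAnalysis")     # make the function available on each worker

output <- parLapply(cl, files, soundAnalysis)
stopCluster(cl)

out <- do.call(rbind, output)          # combine the per-file rows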
Brigadeiro