I'm trying to get parallel processing working in RStudio, both on my local installation and on RStudio Cloud, using the doParallel package and following the tutorial here.
Unfortunately, registering a parallel backend seems to slow the computation down rather than speed it up.
Test operation:
microbenchmark(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
system.time(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
Results without parallel processing:
Unit: milliseconds
                                    expr      min       lq    mean   median       uq      max neval
 foreach(i = 1:1000) %do% sum(tanh(1:i)) 183.1157 196.3723 222.237 206.3648 227.4821 417.8161   100
   user  system elapsed
   0.33    0.04    0.19
Results after turning on parallel processing (roughly 2x as slow, judging by the median and elapsed times!):
Unit: milliseconds
                                       expr      min       lq     mean   median       uq      max neval
 foreach(i = 1:1000) %dopar% sum(tanh(1:i)) 331.3142 371.2502 406.0369 389.7049 412.8814 814.3407   100
   user  system elapsed
   0.28    0.10    0.37
How strange! Any tips? Below I include the full script I ran, along with the logs from my local RStudio session and from RStudio Cloud.
Full Script
install.packages('doParallel')
library(doParallel)
install.packages('microbenchmark')
library(microbenchmark)
# Without parallel processing
microbenchmark(foreach(i=1:1000) %do% sum(tanh(1:i)))
system.time(foreach(i=1:1000) %do% sum(tanh(1:i)))
# Without parallel processing, get a warning
microbenchmark(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
system.time(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
# Turn on parallel with several cores
registerDoParallel(detectCores() - 2)
# See number of cores
getDoParWorkers()
# Test for speed improvement With parallel processing
microbenchmark(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
system.time(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
# Return to one worker
registerDoParallel(1)
registerDoSEQ()
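For a more direct comparison, the two variants can also be timed in a single microbenchmark call, so both appear in one results table. This is only a minimal sketch, not part of the script that produced the logs: the explicit cluster of 2 workers, the labels `sequential` and `parallel`, and `times = 10` are assumptions chosen so it runs quickly on a small machine.

```r
library(doParallel)      # also attaches foreach, iterators and parallel
library(microbenchmark)

# Assumption: an explicit cluster with 2 workers, so it can be stopped cleanly.
cl <- makeCluster(2)
registerDoParallel(cl)

# Time the same loop with %do% and %dopar% side by side.
res <- microbenchmark(
  sequential = foreach(i = 1:1000) %do% sum(tanh(1:i)),
  parallel   = foreach(i = 1:1000) %dopar% sum(tanh(1:i)),
  times = 10
)
print(res)

# Shut the workers down and go back to the sequential backend.
stopCluster(cl)
registerDoSEQ()
```

Creating the cluster with makeCluster() and passing it to registerDoParallel() keeps a handle to the workers, so stopCluster() can release them at the end.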
Log from local run:
Restarting R session...
Warning message:
<REDACTED LINE>
Error 6 (The handle is invalid)
Features disabled: R source file indexing, Diagnostics
Error in summary.connection(connection) : invalid connection
Error in summary.connection(connection) : invalid connection
<REDACTED LINE>
> install.packages('doParallel')
Installing doParallel [1.0.16] ...
OK [linked cache]
> library(doParallel)
Loading required package: foreach
Loading required package: iterators
Loading required package: parallel
Warning messages:
1: package ‘doParallel’ was built under R version 4.0.3
2: package ‘foreach’ was built under R version 4.0.3
3: package ‘iterators’ was built under R version 4.0.3
> install.packages('microbenchmark')
Installing microbenchmark [1.4-7] ...
OK [linked cache]
> library(microbenchmark)
Warning message:
package ‘microbenchmark’ was built under R version 4.0.3
>
> # Without parallel processing
> microbenchmark(foreach(i=1:1000) %do% sum(tanh(1:i)))
Unit: milliseconds
expr min lq mean median uq max neval
foreach(i = 1:1000) %do% sum(tanh(1:i)) 183.1157 196.3723 222.237 206.3648 227.4821 417.8161 100
>
> system.time(foreach(i=1:1000) %do% sum(tanh(1:i)))
user system elapsed
0.33 0.04 0.19
>
> # Without parallel processing, get a warning
> microbenchmark(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
Unit: milliseconds
expr min lq mean median uq max neval
foreach(i = 1:1000) %dopar% sum(tanh(1:i)) 178.1788 188.879 213.9808 197.2124 227.6921 698.484 100
Warning message:
executing %dopar% sequentially: no parallel backend registered
>
> system.time(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
user system elapsed
0.22 0.03 0.25
>
> # Turn on parallel with several cores
> registerDoParallel(detectCores() - 2)
>
> # See number of cores
> getDoParWorkers()
[1] 6
>
> # Test for speed improvement With parallel processing
> microbenchmark(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
Unit: milliseconds
expr min lq mean median uq max neval
foreach(i = 1:1000) %dopar% sum(tanh(1:i)) 331.3142 371.2502 406.0369 389.7049 412.8814 814.3407 100
>
> system.time(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
user system elapsed
0.28 0.10 0.37
>
> # Return to one worker
> registerDoParallel(1)
> registerDoSEQ()
Log from RStudio Cloud:
Restarting R session...
> install.packages('doParallel')
Installing package into ‘/home/rstudio-user/R/x86_64-pc-linux-gnu-library/4.0’
(as ‘lib’ is unspecified)
trying URL 'http://package-proxy/src/contrib/doParallel_1.0.16.tar.gz'
Content type 'application/x-tar' length 59776 bytes (58 KB)
==================================================
downloaded 58 KB
* installing *binary* package ‘doParallel’ ...
* DONE (doParallel)
The downloaded source packages are in
‘/tmp/RtmplDZYAT/downloaded_packages’
> library(doParallel)
Loading required package: foreach
Loading required package: iterators
Loading required package: parallel
> install.packages('microbenchmark')
Installing package into ‘/home/rstudio-user/R/x86_64-pc-linux-gnu-library/4.0’
(as ‘lib’ is unspecified)
trying URL 'http://package-proxy/src/contrib/microbenchmark_1.4-7.tar.gz'
Content type 'application/x-tar' length 61382 bytes (59 KB)
==================================================
downloaded 59 KB
* installing *binary* package ‘microbenchmark’ ...
* DONE (microbenchmark)
The downloaded source packages are in
‘/tmp/RtmplDZYAT/downloaded_packages’
> library(microbenchmark)
>
> # Without parallel processing
> microbenchmark(foreach(i=1:1000) %do% sum(tanh(1:i)))
Unit: milliseconds
expr min lq mean median uq max neval
foreach(i = 1:1000) %do% sum(tanh(1:i)) 121.6417 126.5681 130.8152 129.7511 133.3043 171.6484 100
>
> system.time(foreach(i=1:1000) %do% sum(tanh(1:i)))
user system elapsed
0.126 0.000 0.126
>
> # Without parallel processing, get a warning
> microbenchmark(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
Unit: milliseconds
expr min lq mean median uq max neval
foreach(i = 1:1000) %dopar% sum(tanh(1:i)) 117.6518 124.2508 127.9016 127.1467 129.9798 171.9952 100
Warning message:
executing %dopar% sequentially: no parallel backend registered
>
> system.time(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
user system elapsed
0.169 0.000 0.169
>
> # Turn on parallel with several cores
> registerDoParallel(detectCores() - 2)
>
> # See number of cores
> getDoParWorkers()
[1] 14
>
> # Test for speed improvement With parallel processing
> microbenchmark(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
Unit: milliseconds
expr min lq mean median uq max neval
foreach(i = 1:1000) %dopar% sum(tanh(1:i)) 262.9285 302.7655 340.1377 325.8734 359.3806 707.4004 100
>
> system.time(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
user system elapsed
0.136 0.176 0.313
>
> # Return to one worker
> registerDoParallel(1)
> registerDoSEQ()
>
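Both logs show the same pattern: the run with a registered backend is slower than the sequential one. One experiment that might help tell whether the cost comes from dispatching 1000 very small tasks is to hand each worker one large chunk instead. The following is a hedged sketch, not something from the runs above: the worker count of 2 and the names `chunks`, `chunked` and `sequential` are assumptions, and it only checks that the chunked loop returns the same values. Whether it is actually faster would have to be measured on the machine in question.

```r
library(doParallel)  # also attaches foreach, iterators and parallel

# Assumption: 2 workers, so the sketch also runs on small machines.
cl <- makeCluster(2)
registerDoParallel(cl)

# Split the indices 1..1000 into one contiguous chunk per worker.
chunks <- splitIndices(1000, getDoParWorkers())

# Each task now processes a whole chunk rather than a single i.
chunked <- foreach(idx = chunks, .combine = c) %dopar% {
  vapply(idx, function(i) sum(tanh(1:i)), numeric(1))
}

# Reference result from the plain sequential loop.
sequential <- foreach(i = 1:1000, .combine = c) %do% sum(tanh(1:i))

stopCluster(cl)
registerDoSEQ()

# Confirm that both approaches compute the same 1000 values.
print(all.equal(chunked, sequential))
```

Wrapping either loop in system.time() or microbenchmark() would show how the chunked version compares on a given machine.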