How should I call external programs from sub-instances of parallelized R? The problem could also occur in other contexts, but I am using library(foreach) and library(doFuture) on a Slurm-based HPC. As an example, I have created a hello.txt that contains "hello world", and in my R script I have the following lines just before and within the %dopar% {}:
message(getwd())
system("echo 'hello directly'")
system("cat hello.txt")
The result in the .out file of the sbatch run looks like this, after I have asked for two %dopar% iterations:
/lustre/scratch/myuser
hello directly
hello world
/lustre/scratch/myuser
/lustre/scratch/myuser
Error in { : task 2 failed - "cannot open the connection"
Calls: %dopar% -> <Anonymous>
Thus, the main R instance on the login node and the sub-instances on the computing nodes seem to have the same working directory, and working with the same files through native R functions has not been a problem before. However, calling system() on the computing nodes fails for some reason. Any help?
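
To narrow this down, here is a minimal diagnostic sketch of the same loop. It is not a fix, and it makes some assumptions: plan(multisession, workers = 2) is only a stand-in for whatever plan the real job uses, and the field names in the returned list are made up for illustration. Instead of letting the external command write straight to the worker's standard output, it captures the output with system2() and returns it to the main process together with the node name, working directory, and whether hello.txt is visible to the worker.

```r
library(foreach)
library(future)
library(doFuture)

registerDoFuture()
# Assumption: a local two-worker plan; replace with the plan used on the cluster.
plan(multisession, workers = 2)

res <- foreach(i = 1:2) %dopar% {
  # Capture stdout and stderr of the external command as a character vector,
  # so it comes back as an ordinary return value rather than console output.
  out <- tryCatch(
    system2("cat", args = "hello.txt", stdout = TRUE, stderr = TRUE),
    error = function(e) paste("system2 failed:", conditionMessage(e))
  )
  list(
    task   = i,
    node   = Sys.info()[["nodename"]],   # which machine ran this task
    wd     = getwd(),                    # working directory seen by the worker
    exists = file.exists("hello.txt"),   # is the file visible from the worker?
    output = out
  )
}

str(res)
```

If the captured output comes back fine here while plain system() still fails, that would point towards how the workers' standard output is handled rather than towards the file or the working directory.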