
I am trying to run an R script in an HPC environment. The HPC system has 8 nodes with 20 cores each, and I wish to use 2 nodes, 40 cores in total. I am submitting the job through SLURM, and it runs the .R file, which contains parallel computing code. I have the following .sbatch file:

#!/bin/bash
#SBATCH --job-name=my_r_script   # Job name
#SBATCH --nodes=2                 # Number of nodes
##SBATCH --ntasks-per-node=20       # Number of tasks per node
#SBATCH --ntasks=40                # Total number of tasks (cores)
#SBATCH --cpus-per-task=1         # Number of CPU cores per task
#SBATCH --mem=4G                  # Memory per node (e.g., 4G, 8G, 16G)
#SBATCH --time=1:00:00            # Wall clock time limit (hh:mm:ss)

module load R    # Load the R module 

Rscript my_Rscript.R

However, when I look at the results, I can see that only 20 cores from a single node are being used, not all 40. How do I write the .sbatch file to ensure that all 40 cores across the 2 nodes are used to run the R code in parallel?

I have used the idea presented in the response here: https://stackoverflow.com/a/73828155/12493753 to understand --ntasks and --cpus-per-task=1.

  • I have seen https://stackoverflow.com/questions/54905099/slurm-use-cores-from-multiple-nodes-for-r-parallelization?rq=2 and the responses but I was wondering if there is a way to change the R-code or include two nodes together to run 40 cores. – Arnoneel Sinha Jul 06 '23 at 16:46
  • Are you sure your code can utilise more than one node at all? – Bracula Jul 09 '23 at 21:17

1 Answer


Your Slurm submission runs only one copy of your R script, on one node, despite allocating two nodes, unless you engage MPI in your R code. My preferred way is to use the pbdMPI package to manage how many R sessions I run (and the cooperation between sessions), and then use the parallel package's mclapply to manage multicore shared-memory computing within each session. For example, with 4 R sessions, each using 10 cores, your Slurm submission would look something like this:

#!/bin/bash
#SBATCH --job-name=my_r_script   # Job name
#SBATCH --nodes=2                 # Number of nodes
#SBATCH --exclusive               # Use all cores on allocated nodes
#SBATCH --mem=4G                  # Memory per node (e.g., 4G, 8G, 16G)
#SBATCH --time=1:00:00            # Wall clock time limit (hh:mm:ss)

module load openmpi # To load OpenMPI - may be site-dependent
module load R    # Load the R module 

## Run 2 R sessions per node (map-by is OpenMPI-specific):
mpirun --map-by ppr:2:node Rscript my_Rscript.R 

Within each session, the 10 cores would be used with parallel::mclapply(<parameters>, mc.cores = 10). You could also use all 40 cores with --map-by ppr:20:node, in which case you would be running 40 R sessions; the latter would use more memory.
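
For concreteness, here is a minimal sketch of what my_Rscript.R could look like under this setup; the work function, the input vector, and the output file name are placeholders, not taken from the question:

library(pbdMPI)   # manages the 4 cooperating R sessions (MPI ranks)
library(parallel) # provides mclapply for the 10 cores within each session

init()            # start MPI for this R session

my_task <- function(x) {           # placeholder per-item computation
  sum(rnorm(1e5)) + x
}

all_inputs <- 1:400                         # placeholder inputs
my_ids     <- get.jid(length(all_inputs))   # indices owned by this rank

## Shared-memory parallelism within this session (10 cores):
my_results <- mclapply(all_inputs[my_ids], my_task, mc.cores = 10)

## Collect the per-session results on rank 0 and save them there:
combined <- gather(my_results)
if (comm.rank() == 0) saveRDS(combined, "results.rds")

finalize()        # shut down MPI cleanly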

There are other ways to specify the same thing via either Slurm or OpenMPI. Unfortunately, there are site-dependent defaults in Slurm, site-dependent deployments of R, and different flavors of MPI.
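
For example (a sketch, assuming your OpenMPI was built with Slurm support so that mpirun picks up the task layout from the allocation), the same 4-sessions-by-10-cores layout can be expressed with Slurm directives instead of --map-by:

#!/bin/bash
#SBATCH --job-name=my_r_script    # Job name
#SBATCH --nodes=2                 # Number of nodes
#SBATCH --ntasks-per-node=2       # 2 R sessions (MPI ranks) per node
#SBATCH --cpus-per-task=10        # 10 cores for mclapply within each session
#SBATCH --mem=4G                  # Memory per node
#SBATCH --time=1:00:00            # Wall clock time limit (hh:mm:ss)

module load openmpi # To load OpenMPI - may be site-dependent
module load R       # Load the R module

## Slurm-aware OpenMPI launches one rank per allocated task (4 here);
## add --bind-to none if the mclapply children end up pinned to one core:
mpirun Rscript my_Rscript.R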