37

In an sbatch script, you can launch programs or scripts directly (for example an executable file `myapp`), but many tutorials use `srun myapp` instead.

Despite reading some documentation on the topic, I do not understand the difference and when to use each of those syntaxes.

I hope this question is precise enough (1st question on SO), thanks in advance for your answers.

RomualdM
  • is the scenario you have the same as the submission script I provided as an example in this question: https://stackoverflow.com/questions/72092272/do-sbatch-submission-scripts-in-slurm-really-need-the-srun-command-to-run-intend ? – Charlie Parker May 02 '22 at 20:54
  • can you provide an example sbatch submission script for your question? – Charlie Parker May 02 '22 at 21:23
  • @CharlieParker This question dates from my previous job: I don't have any access to a Slurm HPC now and won't be able to provide any reliable example – RomualdM May 22 '22 at 12:36

1 Answer

30

The srun command is used to create job 'steps'.

First, it gives better reporting of resource usage: the `sstat` command provides real-time resource usage for processes started with `srun`, and each step (each call to `srun`) is reported individually in the accounting.

Second, it can be used to set up many instances of a serial program (a program that only uses one CPU) within a single job, and to micro-schedule those programs inside the job allocation.
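
As a minimal sketch of this micro-scheduling pattern, the snippet below writes a hypothetical submission script to a file named `serial_steps.sbatch`. The program `./myprog`, the input file names, and the task count are placeholders, not taken from a real setup. Depending on the Slurm version, an extra flag such as `--exact` or `--exclusive` may be needed on each `srun` line to keep the steps from sharing CPUs.

```shell
# Write a hypothetical submission script to a file; nothing is submitted here.
cat > serial_steps.sbatch <<'EOF'
#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=1

# Each srun call creates one job step that runs a single task on one node.
# The trailing & puts the step in the background so the steps run side by side.
for i in 1 2 3 4; do
    srun -N1 -n1 ./myprog "input_${i}.dat" &
done

# Block until every background step has finished; otherwise the batch
# script would exit and the job would end while steps are still running.
wait
EOF
```

On a cluster, the file would be submitted with `sbatch serial_steps.sbatch`, and each `srun` call would then show up as its own step in the accounting.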

Finally, for parallel jobs, `srun` also plays the important role of starting the parallel program and setting up the parallel environment. It starts as many instances of the program as were requested with the `--ntasks` option, on the CPUs that were allocated to the job. In the case of an MPI program, it also handles the communication between the MPI library and Slurm.
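
To make the contrast with a direct launch concrete, here is a rough sketch that writes a hypothetical submission script to `parallel_job.sbatch`. The executable `./my_mpi_app` and the task count are assumptions for illustration, and the last line presumes an MPI library built with Slurm support.

```shell
# Write a hypothetical submission script to a file; nothing is submitted here.
cat > parallel_job.sbatch <<'EOF'
#!/bin/bash
#SBATCH --ntasks=8

# Launched directly: runs once, on the node that executes the batch script.
hostname

# Launched through srun: one instance per task, so 8 instances in total,
# placed on the CPUs allocated to the job.
srun hostname

# For an MPI program, srun starts the 8 ranks and connects them to Slurm.
srun ./my_mpi_app
EOF
```

Once submitted with `sbatch parallel_job.sbatch`, the job output should contain one host name from the direct call followed by eight from the `srun` step.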

damienfrancois
  • Thanks a lot for this precise answer – RomualdM Dec 05 '18 at 21:03
  • In the case of setting up many instances of a serial program, a typical case is `srun -N1 -n1 myprog &`, right? If the sbatch job allocation spans more than one node, will `srun` ensure each instance runs on an independent CPU better than just `myprog &`? In fact, what happens if the script simply has `myprog &` and the allocation spans more than one node? – bernie Mar 12 '19 at 16:17
  • If the script simply has `myprog &` and the allocation spans more than one node, only the first node will have processes running, and those processes will compete for the same CPUs – damienfrancois Apr 12 '19 at 12:30
  • what if I have GPUs -- single and multiple? – Charlie Parker May 02 '22 at 20:53
  • would an example run with srun be `srun python main.py`? Asking cuz I only know with `srun hostname` – Charlie Parker May 02 '22 at 20:56