
I need to run 10,000 jobs on Slurm (each taking about 30 minutes, say). My plan was to do it with a job array, running 250 tasks in parallel, like so:

sbatch --array=0-9999%250 array_script.sh args

Unfortunately, the sysadmin hasn't changed Slurm's MaxArraySize from the default of 1001. To "circumvent" this, I was planning to slice the overall job into 10 pieces and somehow schedule each piece so that it runs after the previous piece has finished. For example, I would start with:

sbatch --array=0-999%250 array_script.sh args

then when that is done, I would do:

sbatch --array=1000-1999%250 array_script.sh args

Now I need to schedule this somehow. I'm not very experienced with bash, and I already have a Python wrapper around the whole thing that handles a lot of other work around the job array, so I figured I would do the scheduling in Python as well. How would this normally be done?

Currently I have:

    import subprocess

    for i in range(num_slices):
        command = 'sbatch --array={lower_end}-{upper_end}%250 array_script.sh {args}'.format(
            lower_end=i * 1000, upper_end=min((i + 1) * 1000 - 1, num_targets - 1), args=args)
        subprocess.run(command, shell=True)
        # << need to have a step here that waits until the job array is done >>

First of all, in the above I run sbatch with subprocess.run, which means I currently don't know the job ID. Is there a way to capture the output of subprocess.run (or something similar) that would allow me to find the job ID? And how do I do the equivalent of squeue, to check whether the job is still running and decide whether the loop should continue?
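
To make the question concrete, the shape I'm hoping for is something like the sketch below. The "Submitted batch job" parsing and the squeue check are guesses on my part (I'm on Python 3.7+, so capture_output is available), and num_targets, num_slices and args are placeholders for things my wrapper already defines, so I don't know whether this is the right way to do it:

    import re
    import subprocess
    import time

    num_targets = 10000    # total number of jobs (defined in my wrapper)
    num_slices = 10        # 10 slices of at most 1000 tasks each
    args = 'args'          # placeholder for whatever my wrapper passes through

    def submit_slice(lower_end, upper_end, args):
        # On success, sbatch prints a line like "Submitted batch job 123456"
        result = subprocess.run(
            'sbatch --array={}-{}%250 array_script.sh {}'.format(lower_end, upper_end, args),
            shell=True, capture_output=True, text=True)
        return re.search(r'\d+', result.stdout).group(0)

    def wait_for_job(job_id, poll_seconds=60):
        # 'squeue -h -j <jobid>' lists the array's pending/running tasks without a header;
        # once the whole array has left the queue, the output is empty
        while True:
            out = subprocess.run('squeue -h -j {}'.format(job_id),
                                 shell=True, capture_output=True, text=True).stdout
            if not out.strip():
                return
            time.sleep(poll_seconds)

    for i in range(num_slices):
        lower_end = i * 1000
        upper_end = min((i + 1) * 1000 - 1, num_targets - 1)
        job_id = submit_slice(lower_end, upper_end, args)
        wait_for_job(job_id)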

Marses

2 Answers


Slurm will queue all your jobs automatically, so you can send all of them at once.

As long as your jobs are independent, there is no need to wait for the completion of the current job array before sending the next one.

As for getting the subprocess output, you can find the answer here.
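
For example, with Python 3.7+ you can submit every slice in one go and keep the job IDs by capturing sbatch's standard output, which on success is a line like "Submitted batch job 123456". This is only a minimal sketch; the placeholder variables stand in for whatever your wrapper already defines:

    import re
    import subprocess

    num_targets = 10000    # placeholders for values from your wrapper
    num_slices = 10
    args = 'args'

    job_ids = []
    for i in range(num_slices):
        lower_end = i * 1000
        upper_end = min((i + 1) * 1000 - 1, num_targets - 1)
        # capture sbatch's stdout instead of letting it go to the terminal
        result = subprocess.run(
            'sbatch --array={}-{}%250 array_script.sh {}'.format(lower_end, upper_end, args),
            shell=True, capture_output=True, text=True)
        job_ids.append(re.search(r'\d+', result.stdout).group(0))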

Charles
  • In this case, the maximum submission limit seems to count across the different jobs, even if they are dependent. So I need a scheduler; otherwise any additional jobs that would push my total job count over the limit get rejected. – Marses Sep 22 '17 at 14:17

For other people stumbling onto this question: your administrators have set limits on the maximum number of jobs and the maximum size of job arrays for a good reason.

Slurm (and other job schedulers) tends to use resources on the management system in proportion to the number of jobs in the queue. With too many jobs, the scheduler itself gets bogged down and scheduling is delayed, which increases the number of queued jobs even further. This can eventually grind the whole system to a halt.

If you need to submit many more jobs than the admins allow you to, the right way to do this is to contact the administrators. Explain what you want to do and what you are trying to achieve. You may get permission to submit your jobs, or they may know of a better way to achieve your goal that doesn't involve running quite so many jobs.

HPC system administrators are generally very happy to discuss how to achieve your goals beforehand. They are a lot less happy putting out the fires caused by a user trying to circumvent a limit that was there for a good technical reason.

Janne