
I have a job running on a Linux machine managed by Slurm. Now that the job has been running for a few hours, I realize that I underestimated the time it needs to finish, so the value of the --time argument I specified is not enough. Is there a way to add time to an existing running job through Slurm?

user1701545

3 Answers


Use the scontrol command to modify the job:

scontrol update jobid=<job_id> TimeLimit=<new_timelimit>

Use the Slurm time format, e.g. for 8 days 15 hours: TimeLimit=8-15:00:00

On some machines, this requires admin privileges.

On most machines, regular users are allowed to do this only if the job has not started running yet.
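
As a minimal sketch of the time format described above, the snippet below converts a duration in whole hours into the days-hours:minutes:seconds form and prints the resulting scontrol command instead of running it. The helper name slurm_timelimit and the job ID 12345 are placeholders for illustration, not part of Slurm.

```shell
#!/bin/sh
# Hypothetical helper (not a Slurm tool): format a duration given in whole
# hours as D-HH:MM:SS, the days-hours:minutes:seconds form used by TimeLimit.
slurm_timelimit() {
    total_hours=$1
    days=$((total_hours / 24))
    hours=$((total_hours % 24))
    printf '%d-%02d:00:00\n' "$days" "$hours"
}

# 8 days 15 hours = 207 hours.
limit=$(slurm_timelimit 207)

# Print the command rather than executing it; 12345 is a placeholder job ID.
echo "scontrol update jobid=12345 TimeLimit=$limit"
```

Once the printed command looks right, it can be run as-is on a machine where you have permission to change the job.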

Carles Fenoy

To build on the example provided above, you can also use "+" and "-" to increment / decrement the TimeLimit.

From the [scontrol man page](https://slurm.schedmd.com/scontrol.html):

either specify a new time limit value or precede the time and equal sign with a "+" or "-" to increment or decrement the current time limit (e.g. "TimeLimit+=30")

We regularly get requests like "I need 3 more hours for job XXXXX to finish!!!", which would translate to:

scontrol update job=XXXXX TimeLimit=+03:00:00
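
For repeated requests of this kind, the increment form can be wrapped in a small function. This is only a sketch: extend_cmd is a hypothetical name, 12345 is a placeholder job ID, and the function prints the command so it can be reviewed before anyone runs it.

```shell
#!/bin/sh
# Hypothetical helper: build the scontrol command that adds extra time
# to a job's current limit using the "+" increment form.
extend_cmd() {
    job_id=$1
    extra=$2   # amount to add, e.g. 03:00:00 for three hours
    echo "scontrol update job=$job_id TimeLimit=+$extra"
}

# Prints the command for adding three hours to placeholder job 12345.
extend_cmd 12345 03:00:00

# On a Slurm host, the job's limit and remaining time can then be checked with:
#   squeue -h -j 12345 -o "%i %l %L"
```

The commented squeue line uses the %l (time limit) and %L (time left) format fields to confirm that the change took effect.
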
FlakRat
    The astute reader might note that the manual actually states _precede the time **and equal** sign with a "+" or "-"_, but your example puts the + after the =. However, I can confirm that either order seems to work. – MrArsGravis May 31 '23 at 18:00

If you haven't specified a walltime in your Slurm job script, Slurm will typically use the default walltime specified in your Slurm cluster configuration. To increase the walltime of a running job in Slurm, you can use the scontrol command to modify the job's time limit. Here's the command you can use:

scontrol update JobID=<job_id> TimeLimit=<new_walltime>

Replace <job_id> with the actual job ID of the job you want to modify, and <new_walltime> with the new walltime you want to set for the job. Make sure you have the necessary permissions to modify the job, as this may require administrative privileges or job ownership.

For example: scontrol update JobID=12345 TimeLimit=2-00:00:00

This example sets the walltime of the job with ID 12345 to 2 days.
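
Before choosing a new value, it can help to see which limit the job currently has, especially if it picked up a default. The sketch below extracts the TimeLimit field from text shaped like the output of scontrol show job. The function name get_timelimit is hypothetical, and the sample line is an illustrative stand-in, not output captured from a real cluster.

```shell
#!/bin/sh
# Hypothetical helper: read "scontrol show job"-style text on stdin and
# print the value of the TimeLimit field.
get_timelimit() {
    sed -n 's/.*[[:space:]]TimeLimit=\([^[:space:]]*\).*/\1/p'
}

# Illustrative sample line (an assumption about the output layout).
sample='   RunTime=00:10:05 TimeLimit=2-00:00:00 TimeMin=N/A'
printf '%s\n' "$sample" | get_timelimit

# On a Slurm host, with 12345 as a placeholder job ID:
#   scontrol show job 12345 | get_timelimit
```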