I am submitting jobs to a queue on a cluster and want to check whether a job is done. I do this by checking whether the `jobID` is present in the output of a command (called `jobs`) that lists all the jobs that are currently running. I call `jobs` via the shell, parse its output and see if `jobID` is there. If it isn't, I take that as a signal that the job has terminated:
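
    import subprocess
    import time

    sleep = 2  # polling interval, in seconds
    while True:
        # communicate() waits for the command and returns (stdout, stderr)
        output = subprocess.Popen("jobs %i" % (jobID),
                                  shell=True,
                                  stdout=subprocess.PIPE,
                                  stderr=subprocess.PIPE).communicate()
        if job_done(output):
            break
        time.sleep(sleep)
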
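The `job_done` helper is not shown here. As a rough illustration of the check described above, a minimal sketch might look like the following. The name `job_is_done`, its two-argument signature, and the assumption that `jobs` prints the numeric ID as a whitespace-separated token on stdout are all hypothetical, not taken from the actual cluster tooling.

```python
def job_is_done(output, job_id):
    """Return True if job_id no longer appears in the captured listing.

    `output` is the (stdout, stderr) tuple returned by communicate().
    Assumption: the listing prints the job ID as a whitespace-separated token.
    """
    stdout, _stderr = output
    # communicate() returns bytes on Python 3 unless text mode is enabled
    if isinstance(stdout, bytes):
        stdout = stdout.decode("utf-8", errors="replace")
    # Compare whole tokens so that ID 123 does not match inside 1234
    return str(job_id) not in stdout.split()
```

Matching whole tokens rather than substrings avoids treating a job as still running just because its ID is a prefix of another job's ID.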
Since `sleep` is set to 2, the check runs every two seconds, but the job might run for several hours. Occasionally, at seemingly random times, I get the OSError `Cannot allocate memory`, even though there's a ton of memory on the machine and the thread does nothing memory intensive except check the output of `jobs`. What could be causing this? Is there a better way to do this than to use `Popen`, `PIPE` and `communicate`?
This issue seems similar to the one reported here (Python subprocess.Popen "OSError: [Errno 12] Cannot allocate memory"), but that question was never resolved.