
I have the following code, which should start a job and kill the process if it takes too long to finish.

import random
from datetime import datetime
from subprocess import check_output, STDOUT, TimeoutExpired

MAX_WAIT = 5

def custom_task(job=None):
    sec = job or random.randint(4,10)
    print('\nThis task will sleep {s} sec.'.format(s=sec))
    print('-/- {time} - - Commence new job <{job}>'.format(
             job=sec, time=datetime.now()))
    try:
        cmd = 'sleep {s}; echo "Woke up after {s} sec." | tee -a tasks.log'.format(s=sec)
        stdout = check_output(cmd, shell=True, stderr=STDOUT, timeout=MAX_WAIT).decode()
    except TimeoutExpired:
        print('~/~ {time} - - Job <{job}> has been cancelled'.format(
                 job=sec, time=datetime.now()))
    except Exception as err:
        print('!/! {time} - - Job <{job}> could not finish because of the error'.format(
                 job=sec, time=datetime.now()))
        print('{err}'.format(err=err))
    else:
        print('=/= {time} - - Job <{job}> has been done'.format(
                 job=sec, time=datetime.now()))
        print(stdout)

custom_task(4)
custom_task(8)

Console gives me the following output:

This task will sleep 4 sec.
-/- 2016-11-03 01:07:56.037104 - - Commence new job <4>
=/= 2016-11-03 01:08:00.051072 - - Job <4> has been done
Woke up after 4 sec.


This task will sleep 8 sec.
-/- 2016-11-03 01:08:00.051233 - - Commence new job <8>
~/~ 2016-11-03 01:08:08.062563 - - Job <8> has been cancelled

Note that the task which was supposed to sleep 8 sec released the block only after 8 sec, not after the expected MAX_WAIT = 5 sec.

But if I check tasks.log I see:

Woke up after 4 sec.

This means that only the 4 sec task finished successfully, which is desired and expected. Hence my script kind of works, but in a rather unexpected and undesired way.

Is there a way to release the block (kill the process) on time, i.e. as soon as the MAX_WAIT timeout is exceeded?

And what is going on here: why does Python wait until sleep has finished sleeping?

EDIT - A WORKING EXAMPLE

from random import randint
from os import killpg
from signal import SIGKILL
from datetime import datetime
from subprocess import Popen, STDOUT, TimeoutExpired, PIPE

MAX_WAIT = 5

def custom_task(job=None):
    sec = job or randint(4,10)
    print('\n# This task will sleep {s} sec.'.format(s=sec))
    print('-/- {time} - - Commence new job <{job}>'.format(
          job=sec, time=datetime.now()))
    cmd = 'sleep {s}; echo "Woke up after {s} sec." | tee -a tasks.log'.format(s=sec)
    with Popen(cmd, shell=True,
               stdout=PIPE, stderr=STDOUT,
               close_fds=True,
               universal_newlines=True,
               start_new_session=True) as proc:
        try:
            stdout = proc.communicate(timeout=MAX_WAIT)[0]
        except TimeoutExpired:
            print('~/~ {time} - - Job <{job}> has been cancelled'.format(
                  job=sec, time=datetime.now()))
            killpg(proc.pid, SIGKILL)
            stdout = proc.communicate(timeout=1)[0]
        except Exception as err:
            print('!/! {time} - - Job <{job}> could not finish because of the error'.format(
                  job=sec, time=datetime.now()))
            print('{err}'.format(err=err))
            killpg(proc.pid, SIGKILL)
            stdout = ''  # no output was collected; keeps print(stdout) below from raising NameError
        else:
            print('=/= {time} - - Job <{job}> has been done'.format(
                  job=sec, time=datetime.now()))
        print('# Return code: {}'.format(proc.returncode))
        print(stdout)

custom_task(4)
custom_task(30)

OUTPUT

$ time python3 popen.py 

# This task will sleep 4 sec.
-/- 2016-11-05 15:38:13.833871 - - Commence new job <4>
=/= 2016-11-05 15:38:17.842769 - - Job <4> has been done
# Return code: 0
Woke up after 4 sec.


# This task will sleep 30 sec.
-/- 2016-11-05 15:38:17.842942 - - Commence new job <30>
~/~ 2016-11-05 15:38:22.849511 - - Job <30> has been cancelled
# Return code: -9


real    0m9.095s
user    0m0.087s
sys     0m0.000s
NarūnasK
  • Workaround that I commonly use: wrap your command incantation with the unix command timeout. You can then catch the return code status and process it as per the man page. – Lmwangi Nov 03 '16 at 10:04
  • Thanks for the tip, I suppose I like it even better than `python` implementation, though it makes my script less portable, but in this case it's not a problem. – NarūnasK Nov 05 '16 at 15:50
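
A minimal sketch of the `timeout` workaround suggested in the comment above, assuming GNU coreutils `timeout` is on the PATH (it exits with status 124 when the time limit is hit). The helper name `run_with_timeout` and the constant `TIMED_OUT` are made up for illustration and do not come from the question.

```python
import subprocess

TIMED_OUT = 124  # exit status GNU coreutils `timeout` uses when the limit is hit


def run_with_timeout(cmd, seconds):
    """Run the shell command `cmd` under `timeout`.

    Returns a (timed_out, output) tuple. Hypothetical helper, not part
    of the original script.
    """
    result = subprocess.run(
        ['timeout', str(seconds), 'sh', '-c', cmd],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.returncode == TIMED_OUT, result.stdout


if __name__ == '__main__':
    # The command is cut off after about 1 second, so "late" is never echoed.
    print(run_with_timeout('sleep 3; echo late', 1))
```

Python needs no timeout handling of its own here, because the time limit is enforced by the external command. As noted in the reply, the price is portability: the script only works where `timeout` is installed.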

1 Answer


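With shell=True, the check_output call actually starts several processes: the shell, plus the sleep, echo and tee commands it spawns. When the timeout expires, subprocess sends the kill signal only to its direct child (the shell), not to the descendant processes. check_output then keeps waiting, most likely because the surviving sleep still holds the output pipe open, and it only returns once that process exits. For a solution look at this post on StackOverflow.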
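
The behaviour described in this answer can be reproduced with a small sketch, assuming a POSIX system with `sh` and `sleep`. The function name `kill_and_collect` and its parameters are hypothetical, chosen only for this illustration. It kills either just the shell or the whole process group, then measures how long reading the output blocks.

```python
import os
import signal
import time
from subprocess import PIPE, Popen


def kill_and_collect(kill_group, delay=0.2):
    """Start `sleep 2; echo done` through the shell, kill it after `delay`
    seconds, and return (seconds spent waiting for the output, output).

    Hypothetical helper for illustration only.
    """
    proc = Popen('sleep 2; echo done', shell=True, stdout=PIPE,
                 universal_newlines=True,
                 start_new_session=kill_group)  # new session => new process group
    time.sleep(delay)
    if kill_group:
        # The shell leads its own process group, so its pid is also the
        # group id: this kills the shell and the sleep it spawned.
        os.killpg(proc.pid, signal.SIGKILL)
    else:
        # Kills only the shell; the orphaned sleep keeps the pipe open.
        proc.kill()
    started = time.monotonic()
    output = proc.communicate()[0]  # returns once the pipe is closed
    return time.monotonic() - started, output


if __name__ == '__main__':
    print('shell only:    waited %.1f s' % kill_and_collect(kill_group=False)[0])
    print('process group: waited %.1f s' % kill_and_collect(kill_group=True)[0])
```

With only the shell killed, the wait should last until `sleep` exits on its own. With the process group killed, it should return almost immediately. In both cases the output is empty, because the shell dies before it reaches `echo`.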

J. P. Petersen