
I have a Flask application that exposes an API which can run an application in the background and kill it later by PID. In my unit test, however, after killing the process I check it with psutil.pid_exists(pid), and it always seems to return true. I have checked manually that the PID no longer exists, yet running psutil.pid_exists(pid) in a separate Python console also returned true. This is causing my test to fail.

In views.py, I have:

@api.route('/cancel/<pid>', methods=['POST'])
def cancel(pid=None):
    try:
        os.kill(int(pid), signal.SIGTERM)

        data = dict(
            message = 'Successfully killed pid ' + pid)

        return jsonify(status='success', data=data), 200
    except Exception:
        data = dict(
            message = 'Fail to kill job with pid ' + pid)
        return jsonify(status='error', data=data), 400

And in my test:

def test_cancel_job(self):
    # run_script will run something in the background and return the PID
    jobid, pid, cmd = run_script('fake_db', 'fake_cancel_jobid', 'tests/doubles/child.py')

    if not psutil.pid_exists(pid):
        raise Exception('Process is not running')

    # kill the job and assert it is successful
    resp = self.client.post('/api/cancel/' + str(pid))
    self.assert200(resp)

    # at this point, I have confirmed that the PID has been killed,
    # but the line below still gets executed because
    # psutil.pid_exists(pid) returns true

    # check pid is really killed
    if psutil.pid_exists(pid):
        raise Exception('Process {0} still exist'.format(pid))

I'm running on OS X, if that makes any difference.

Update: I've tried running the test on my build server (Ubuntu 14.04), and the test failed there as well.

This is my run_script:

def run_script(db, jobid, script):
    log = SCRIPTS_LOG + jobid + ".log"

    if not os.path.exists(SCRIPTS_LOG):
        os.makedirs(SCRIPTS_LOG)

    with open(log, "w") as output:
        cmd = ["nohup", "python", script, db]
        p = subprocess.Popen(cmd, stdout=output)

        return jobid, p.pid, " ".join(cmd)

And my child.py:

#!/usr/bin/env python

import time
import os, sys
if 'TEST_ENV' not in os.environ:
    os.environ['TEST_ENV'] = 'Set some env'

    try:
        os.execv(sys.argv[0], sys.argv)
    except Exception, exc:
        print "Failed re-exec:", exc
        sys.exit(1)


def main(argv):
    db = argv[0]

    while True:
        print 'Running child with params ', db
        time.sleep(1)


if __name__ == '__main__':
    main(sys.argv[1:])

I added a simple script that demonstrates this: https://github.com/shulhi/kill-pid/tree/master

number5
Shulhi Sapli
  • try using `signal.CTRL_C_EVENT` ? – itzMEonTV Apr 28 '15 at 05:49
  • Are you running your Flask app as root? also what's the version of `psutil`? – number5 Apr 28 '15 at 05:56
  • Can you check `pid` existence by sending [`os.kill(pid, 0)`](http://stackoverflow.com/questions/13595076/why-does-os-killpid-0-return-none-although-process-has-terminated) on OS X? – Peter Wood Apr 28 '15 at 06:01
  • @number5 I'm using version 2.2.1. I tried running the test both with and without sudo. Both are giving me the same result. – Shulhi Sapli Apr 28 '15 at 06:49
  • @PeterWood I've tried using that too, but gave me the same result. I followed this http://stackoverflow.com/questions/568271/how-to-check-if-there-exists-a-process-with-a-given-pid – Shulhi Sapli Apr 28 '15 at 06:50
  • I've tested `psutil 2.2.1` with ipython on Mavericks, and it works as in the docs. Would you mind also pasting the code of `run_script`? Also, if you only need to manage the processes you launched within your app, you might not need `psutil` at all – number5 Apr 28 '15 at 07:09
  • @number5 It doesn't work inside a flask application for me. It works fine if I just fire up python console and check the `pid` manually. – Shulhi Sapli Apr 28 '15 at 07:15
  • @number5 I added a simple script to demonstrate the issue. It doesn't even work in a simple script. – Shulhi Sapli Apr 28 '15 at 07:31
  • Ah, you're using `nohup`, that's why. You won't get the correct pid using nohup that way, try not using nohup, just python [...] – number5 Apr 28 '15 at 07:50
  • If I use `nohup` then my script will block and I don't want to daemonize it, I just need it to run in background. After further inspection, the PID does get killed but the status is `zombie`, that's the reason it is returning `true`. However, this is not the case when running on Ubuntu, for some weird reason the status is still `running` – Shulhi Sapli Apr 28 '15 at 08:03
  • @ShulhiSapli it's even easier to block without `nohup`, just use `p.wait()`, see https://docs.python.org/2/library/subprocess.html#subprocess.Popen.wait – number5 Apr 28 '15 at 16:52
  • FWIW, I had a similar problem with zombie child processes on SuSE. In my case I checked the running status with p.poll() and also had to specifically delete the popen object reference after the process ended to prevent it from becoming a zombie (that reference was causing the child to become a zombie in the 1st place). – Dan Cornilescu Apr 29 '15 at 15:57
  • I solved this. Apparently I need to call `os.wait()` right after I kill the process so that it can be reaped. Otherwise it becomes a zombie process and only stops being a zombie after the parent is killed. – Shulhi Sapli Apr 30 '15 at 02:10
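
The zombie state described in the comments can be reproduced with a short script. This is only a sketch under a few assumptions: it is written for Python 3 rather than the Python 2 used in the question, it checks existence with `os.kill(pid, 0)` instead of psutil, and it relies on `os.waitid()` being available on the platform (it is on Linux). The helper names `pid_in_process_table` and `zombie_demo` are made up for illustration.

```python
import os
import signal
import subprocess
import sys


def pid_in_process_table(pid):
    """Return True if the kernel still has an entry for this PID."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def zombie_demo():
    # Start a child that would otherwise sleep for a minute.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"])
    os.kill(child.pid, signal.SIGTERM)

    # Block until the child has exited, but leave it unreaped (WNOWAIT),
    # so it stays in the process table as a zombie.
    os.waitid(os.P_PID, child.pid, os.WEXITED | os.WNOWAIT)
    while_zombie = pid_in_process_table(child.pid)

    # Popen.wait() collects the exit status, which reaps the zombie.
    child.wait()
    after_reap = pid_in_process_table(child.pid)
    return while_zombie, after_reap


if __name__ == "__main__":
    print(zombie_demo())
```

The first value reflects the situation in the failing test (the process is dead but not yet reaped), the second the situation after the parent has waited on it.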

1 Answer


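It takes time to terminate a running process, and a terminated child stays in the process table as a zombie until its parent reaps it. In this case it's a child process, so os.wait() or one of its variants will know exactly how long to wait. To future-proof, I'd wait on that specific child with os.waitpid(pid, 0) (or os.waitid(os.P_PID, pid, os.WEXITED) on Python 3.3+).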
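
As a rough illustration of the kill-then-wait approach, here is a minimal Python 3 sketch. The helper name `kill_and_reap` and the `demo` function are hypothetical and not part of the asker's code, and the existence check uses `os.kill(pid, 0)` so that the example has no third-party dependencies. It assumes the killed process is a child of the calling process, as it is in the test from the question; for any other PID, `os.waitpid` raises `ChildProcessError`, which the sketch treats as "nothing to reap".

```python
import os
import signal
import subprocess
import sys


def pid_in_process_table(pid):
    """Return True if the kernel still has an entry for this PID."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def kill_and_reap(pid, sig=signal.SIGTERM):
    """Signal a process and, if it is our child, wait for it to exit.

    Returns the raw wait status, or None if the PID is not a child
    of the calling process.
    """
    os.kill(pid, sig)
    try:
        _, status = os.waitpid(pid, 0)  # blocks until the child is gone
    except ChildProcessError:
        return None  # not our child, so there is nothing for us to reap
    return status


def demo():
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"])
    before = pid_in_process_table(child.pid)
    status = kill_and_reap(child.pid)
    after = pid_in_process_table(child.pid)
    killed_by_sigterm = (
        os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGTERM)
    return before, killed_by_sigterm, after


if __name__ == "__main__":
    print(demo())
```

In the Flask view, the same idea would amount to calling a wait variant right after `os.kill`, bearing in mind that a blocking wait holds the request until the child has actually exited.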

Cees Timmerman