3

In Python3, I have essentially the following code:

server.py:

import os
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("127.0.0.1", 10000))
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.listen(5)

while True:
    print("waiting")
    connection, client_address = sock.accept()
    print("received")
    child_pid = os.fork()
    if child_pid == 0:
        print("connection received")
        received = connection.recv(1024)
        connection.sendall("OK".encode('utf-8'))
        os._exit(0)

client.py:

import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("127.0.0.1", 10000))    
sock.close()

When I start the server and then run the client, a zombie process remains each time the client finishes.

How can I change the code so that no zombie processes remain?

Raymond Hettinger
Alex
  • [Similar question](http://stackoverflow.com/questions/2760652/how-to-kill-or-avoid-zombie-processes-with-subprocess-module), this one using the [`subprocess`](https://docs.python.org/3/library/subprocess.html) module instead of trying to manage child threads directly. – Matthew Cole Mar 29 '17 at 19:24
  • Are these indeed [zombies](https://en.wikipedia.org/wiki/Zombie_process), that is, processes in terminated state? Maybe they are alive, just lost the parent process? – 9000 Mar 29 '17 at 19:35
  • This is a [duplicate of this question](https://stackoverflow.com/questions/18090230/forking-python-defunct-child) – user2722968 Mar 29 '17 at 19:42
  • Possible duplicate of [Forking python, defunct child](http://stackoverflow.com/questions/18090230/forking-python-defunct-child) – 9000 Mar 29 '17 at 20:40

2 Answers

4

The usual technique is to track all the child PIDs so that the children can be killed when the main process exits, or cleaned up whenever you choose.

You can periodically poll and reap processes as needed or wait until you're about to exit.

For an example of how to do this, look at the collect_children() method of ForkingMixIn in the socketserver module (named SocketServer in Python 2).

The os module has a number of tools for managing child processes, such as os.wait(), os.waitpid() and os.kill().
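As a rough illustration of the "track and periodically reap" idea, here is a minimal sketch. The helper name reap_children and the set called children are made up for this example; only os.fork(), os.waitpid() and os.WNOHANG come from the standard library. It assumes a POSIX system.

```python
import os
import time

def reap_children(active_pids):
    """Reap every tracked child that has already exited, without blocking.

    `active_pids` is a set of child PIDs (a hypothetical bookkeeping
    structure for this sketch); reaped PIDs are removed from it.
    """
    for pid in list(active_pids):
        try:
            # With WNOHANG, waitpid returns (0, 0) if the child is still running.
            done_pid, _status = os.waitpid(pid, os.WNOHANG)
        except ChildProcessError:
            # The child was already reaped somewhere else.
            active_pids.discard(pid)
            continue
        if done_pid == pid:
            active_pids.discard(pid)
    return active_pids

children = set()
for _ in range(3):
    pid = os.fork()
    if pid == 0:
        os._exit(0)        # child: exit immediately
    children.add(pid)      # parent: remember the child

# Poll until every child has been reaped (bounded, so the demo cannot hang).
deadline = time.monotonic() + 5
while children and time.monotonic() < deadline:
    reap_children(children)
    time.sleep(0.05)
```

In the server from the question, the same idea would mean adding child_pid to the set after each fork and calling reap_children(children) once per pass through the accept loop. A zombie can then linger only until the next connection arrives.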

I don't know whether it fits your problem or not, but a multiprocessing.Pool() may be of some help. It automatically manages a pool of subprocesses and reuses them for future tasks. It is mainly helpful when there is only limited data exchange between the processes and when the work is relatively homogeneous (all the child processes are doing the same kind of work).

Raymond Hettinger
  • But the main process runs a very long time – Alex Mar 29 '17 at 19:26
  • Yes, maybe this is a suitable way. Whenever a new child is created, the old one gets destroyed. Not really nice, but at least I won't end up with hundreds of zombie processes... – Alex Mar 29 '17 at 19:29
  • @Alex I think this is just part of the "facts of life" when creating child processes. They need to be tracked and no other tool can automatically know what you want to do with them. – Raymond Hettinger Mar 29 '17 at 19:34
0

When a process exits, it remains in the process table until its parent reads its exit status. Assuming this is Linux, you could make it a daemon and have the init process deal with it. But you could also just call os.waitpid yourself. Here is an example of a class that waits for pids in the background. It's nice because it keeps your program from exiting until it has fully tidied itself up. You could expand it to do things like sending kill signals to child processes, logging results, etc.

import threading
import queue
import os
import time

class ZombieKiller(threading.Thread):
    """Okay, really it's just a zombie waiter, but where's the fun in that?
    """
    def __init__(self):
        super().__init__()
        self.pid_q = queue.Queue()
        self.start()

    def run(self):
        while True:
            pid = self.pid_q.get()
            if pid is None:
                return
            print(pid, 'wait')
            os.waitpid(pid, 0)
            print(pid, 'wait done')

    def cull_zombie(self, pid):
        self.pid_q.put(pid)

    def close(self):
        self.pid_q.put(None)
        self.join()

def test():
    zombie_killer = ZombieKiller()
    for i in range(3):
        pid = os.fork()
        if pid == 0:
            # child
            time.sleep(5)
            print(os.getpid(), 'work done', flush=True)
            os._exit(0)  # leave the forked child without running the parent's cleanup handlers
        else:
            # parent
            zombie_killer.cull_zombie(pid)
    zombie_killer.close()
    print('test complete')


test()
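
For the other option mentioned above, letting init deal with the child, one common approach is a double fork. This is only a sketch under assumptions: spawn_detached and work are hypothetical names, and on Linux the orphaned grandchild is adopted by init or by the nearest subreaper process, which then reaps it.

```python
import os

def spawn_detached(work):
    """Run work() in a grandchild so the caller never has to reap it."""
    pid = os.fork()
    if pid == 0:
        # First child: fork again, then exit straight away.
        if os.fork() == 0:
            # Grandchild: becomes an orphan once the first child exits.
            try:
                work()
            finally:
                os._exit(0)
        os._exit(0)
    # Parent: the first child exits almost immediately, so this wait is short.
    os.waitpid(pid, 0)

# Demo: the grandchild reports back through a pipe.
r, w = os.pipe()
spawn_detached(lambda: os.write(w, b"done"))
os.close(w)
result = os.read(r, 4)   # blocks until the grandchild has written
os.close(r)
print(result)
```

The trade-off is that the parent no longer knows the worker's PID or exit status, so this only fits when the result of the child does not matter to the server.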
tdelaney