
I use the multiprocessing library in a Flask-based web application to start long-running processes. The method that does this is the following:

from multiprocessing import Process

def execute(self, process_id):
    self.__process_id = process_id
    process_dir = self.__dependencies["process_dir"]
    self.fit_dependencies()
    process = Process(target=self.function_wrapper, name=process_id, args=(self.__parameters, self.__config, process_dir,))
    process.start()

When I want to deploy some code on this web application, I restart a service that restarts gunicorn, which is served by nginx. My problem is that this restart kills all child processes started by the application, as if a SIGINT signal were sent to all of them. How can I avoid that?

EDIT: After reading this post, it appears that this behavior is normal. The answer suggests using the subprocess library instead. So I reformulate my question: how should I proceed if I want to start long-running tasks (which are Python functions) from a Python script and make sure they survive the parent process, OR make sure the parent process (which is a gunicorn instance) survives a deployment?

FINAL EDIT: I chose @noxdafox's answer since it is the most complete one. First, using a process queuing system is probably the best practice here. Then, as a workaround, I can still use multiprocessing, but with a python-daemon context (see here and here) inside the function wrapper. Last, @Rippr suggests using subprocess with a different process group, which is cleaner than forking with multiprocessing but involves having standalone functions to launch (in my case I start specific functions from imported libraries).
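For reference, here is roughly what that workaround looks like. This is only a minimal sketch, not my exact code: the module-level function_wrapper and run_task are placeholders, and it assumes the python-daemon package is installed.

import daemon  # pip install python-daemon

def function_wrapper(parameters, config, process_dir):
    # Entering the daemon context detaches this child from gunicorn's session,
    # so it keeps running when the parent is restarted. Note that stdout/stderr
    # are redirected to /dev/null by default, which is why results have to be
    # written to files, Redis, a database, etc.
    with daemon.DaemonContext(working_directory=process_dir):
        run_task(parameters, config, process_dir)  # placeholder for the actual long-running function

The wrapper is still started with multiprocessing.Process as shown above; only its body changes.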

Robin
  • At a glance, I would try to decouple this method from the flask application to an entirely separate process. That way, the child processes won't die with the flask app. For instance, you might be able to delegate this work to a celery app. – sytech Mar 18 '18 at 17:59
  • Thanks. Celery seems to be the main recommended option for that, but it involves adding more dependencies to my project, which I would like to avoid. Is there really no way to start a completely independent process from within a Python script? – Robin Mar 18 '18 at 18:05
  • Still, I don't understand why this is happening. The children should become children of init and keep going until they terminate. – Robin Mar 18 '18 at 18:09
  • Depends what your needs are for communicating with that process, and the implementation may differ based on whether you're using Python2 or Python3 and what platform you're using. Check out [this question](https://stackoverflow.com/questions/27624850/launch-a-completely-independent-process) which may have some answers for you. – sytech Mar 18 '18 at 18:20
  • I handle the communication with files. I'm just trying to understand why the children are shut down instead of becoming orphans. – Robin Mar 20 '18 at 15:21
  • @Robin can you explain how you got `python-daemon` and `multiprocessing` to do what you want? When I'm using `DaemonContext` inside a script, I no longer see any of the script's output... – bluesummers Nov 10 '20 at 09:54
  • You can't see the output because the subprocess is "detached" from its parent so that it survives in case the parent is killed. To keep seeing the logs or get results, you need to work with external storage, e.g. Redis, files, or any database. – Robin Dec 02 '20 at 08:53

3 Answers


I would recommend against your design as it's quite error-prone. Better solutions would de-couple the workers from the server using some sort of queuing system (RabbitMQ, Celery, Redis, ...).

Nevertheless, here are a couple of "hacks" you could try out.

  1. Turn your child processes into UNIX daemons. The python-daemon module could be a starting point.
  2. Instruct your child processes to ignore the SIGINT signal. The service orchestrator might work around that by issuing a SIGTERM or SIGKILL signal if child processes refuse to die, so you might need to disable that feature.

    To do so, just add the following lines at the beginning of the function_wrapper function:

    import signal
    signal.signal(signal.SIGINT, signal.SIG_IGN)
    
noxdafox
  • Thank you, this is a good summary. Could you just elaborate on the "it's quite error-prone" part? – Robin Mar 21 '18 at 13:00
  • And what do you think of using subprocess? – Robin Mar 21 '18 at 13:10
  • As I said, several frameworks try to prevent child processes from leaking. Therefore, you might see the child processes being hard-killed even when ignoring `SIGINT`. There are better solutions for doing what you are trying to achieve: you can orchestrate the micro-services via `docker-compose` or `supervisor`, for example. `systemd` could also fit your purpose. These systems were designed to do what you are trying to achieve with bare Python. – noxdafox Mar 22 '18 at 14:10
  • To run a Python function with `subprocess` you would need to write it in a separate file and call it via `Popen`, for example. This would add quite a lot of overhead, as you are basically starting a brand new Python interpreter. `multiprocessing`, instead, (on UNIX) smartly forks the parent process, which is much faster. – noxdafox Mar 22 '18 at 14:12

Adding on to @noxdafox's excellent answer, I think you can consider this alternative:

subprocess.Popen(['nohup', 'my_child_process'], preexec_fn=os.setpgrp)

Basically, the child processes are killed because they belong to the same process group as the parent. By adding the preexec_fn=os.setpgrp parameter, you are simply requesting that the child processes spawn in their own process group, which means they will not receive the termination signal.

Explanation taken from here.
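Put together with its imports, a self-contained version of that call could look like the sketch below; run_task.py is just a placeholder for a standalone script wrapping the long-running function.

import os
import subprocess

# setpgrp puts the child in its own process group, so signals sent to gunicorn's
# group on restart never reach it; nohup additionally shields it from SIGHUP
# when the parent exits.
subprocess.Popen(['nohup', 'python', 'run_task.py'], preexec_fn=os.setpgrp)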

Ishan Manchanda
  • Thank you for the explanation about the process groups. To fit my situation, it would be perfect to be able to do something like: multiprocessing.Process(function_wrapper, args, preexec_fn=os.setpgrp). – Robin Mar 26 '18 at 11:36
  • Unfortunately, process groups are not yet supported in `multiprocessing` even though there is a stub for its implementation, [check this](https://docs.python.org/3/library/multiprocessing.html?highlight=process#multiprocessing.Process). – noxdafox Mar 26 '18 at 13:33

Ultimately, this problem comes down to a misunderstanding of what it means to do a deployment. In non-enterprise languages like Python (compared to enterprise ones like Erlang), it is generally understood that a deployment wipes out any preceding artefacts of the running process. As such, it would clearly be a bug if your old children/functions didn't actually terminate once a new deployment is performed.

To play devil's advocate, it is even unclear from your question/spec what your actual expectation for the deployment is: do you simply expect your old "functions" to run forever? How do those functions get started in the first place? Who's supposed to know whether or not those "functions" were modified in a given deployment, and whether or not they're supposed to be restarted, and in what fashion? A lot of consideration is given to these very questions in Erlang/OTP (which is unrelated to Python), and, as such, you can't simply expect the machine to read your mind when you use a language like Python that's not even designed for such a use case.

As such, it may be a better option to separate the long-running logic from the rest of the code, and perform the deployment appropriately. As the other answer mentions, this may involve spawning a separate UNIX daemon directly from within Python, or maybe even using an entirely separate logic to handle the situation.

cnst
  • Deployment here refers to the fact that the web application sometimes needs to be updated (new code replaces old code). But I want the processes launched by the app to keep running, since they are atomic computations, i.e. unrelated to the web app itself. Turning them into UNIX daemons seems like a good option to me. – Robin Mar 26 '18 at 11:33
  • The fact that Python isn't specifically designed for that isn't much of a problem. There are plenty of ways to design such process management, but as my question suggests, it's a tricky thing because it's highly tied to the inner workings of UNIX. – Robin Mar 26 '18 at 11:34