
So I've read this nice article about async coroutines in Python. However, threads run into trouble with the GIL and are not as effective as they may seem.

Luckily, Python ships the multiprocessing module, which is designed to sidestep this problem.

I'd like to understand how to implement a multiprocessing queue (with a Pipe open to each process) in an async manner, so it wouldn't hang a running async webserver.

I've read this topic, but I'm not looking for performance; rather, I want to box out a big calculation that hangs my webserver. The calculations involve pictures, so they may have significant I/O exchange, but in my understanding that is something async handles well.

All the calculations are separate from each other, so they are not meant to be mixed.

I'm trying to build this in front of a websocket handler.

If you sense heresy in this, please let me know as well :)

Pawy
  • Is there a reason why you wouldn't be interested in Python's threading module (https://docs.python.org/3/library/threading.html)? – Alceste_ Aug 10 '17 at 02:03
  • Yes, CPU-bound threads have a tendency to contend for the GIL, which makes the whole infrastructure slower. edit: explained here http://dabeaz.blogspot.fr/2010/02/revisiting-thread-priorities-and-new.html – Pawy Aug 10 '17 at 09:07
  • Someone nice on #python gave me hints about async executors; after some research it seems the full answer is here https://pythonadventures.wordpress.com/tag/processpoolexecutor/ – Pawy Aug 10 '17 at 12:05
  • You can make it an answer and accept it so your thread gets solved. ;) – Alceste_ Aug 10 '17 at 12:09

1 Answer


This is re-sourced from an article, after someone nice on the #python IRC channel hinted me at async executors, plus another answer on reddit:

Using ProcessPoolExecutor: “The ProcessPoolExecutor class is an Executor subclass that uses a pool of processes to execute calls asynchronously. ProcessPoolExecutor uses the multiprocessing module, which allows it to side-step the Global Interpreter Lock but also means that only picklable objects can be executed and returned.”

import asyncio
import time
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(num):
    # stand-in for a CPU-bound calculation
    print('entering cpu_heavy', num)
    time.sleep(10)
    print('leaving cpu_heavy', num)
    return num

async def main(loop):
    print('entering main')
    executor = ProcessPoolExecutor(max_workers=3)
    data = await asyncio.gather(*(loop.run_in_executor(executor, cpu_heavy, num)
                                  for num in range(3)))
    print('got result', data)
    print('leaving main')

if __name__ == '__main__':
    # the guard keeps worker processes from re-running this block on start-up
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main(loop))

And this from another nice guy on reddit ;)
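The “only picklable objects” caveat quoted above is easy to trip over: the callable and its arguments are pickled to cross the process boundary, so work submitted to a ProcessPoolExecutor should be a module-level function. A quick sketch (not from the original answer; names are illustrative) of what does and doesn't pickle:

```python
import pickle

def module_level(x):
    # defined at module top level: picklable, safe to submit to a worker
    return x * 2

as_lambda = lambda x: x * 2  # lambdas can't be pickled by name

pickle.dumps(module_level)  # fine

try:
    pickle.dumps(as_lambda)
except (pickle.PicklingError, AttributeError):
    print("lambda rejected: not picklable")
```

The same restriction is why `cpu_heavy` in the answer is defined at the top of the module rather than inside `main`.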

Pawy
    I also wanted to point to this question, which might be better formulated and contains another "version" of the answer: https://stackoverflow.com/questions/27290656/should-i-use-two-asyncio-event-loops-in-one-program/27298880#27298880 ; the difference being that one spawns only one process at a time, whereas the other spawns a bunch of them. So depending on your needs you might want to use run_in_executor alone, or, if you want to split out a calculation, I guess you'll be using asyncio.gather as well. – Pawy Aug 10 '17 at 13:37
  • g e n i u s. simply genius. – An Se Dec 15 '20 at 14:09
  • Note that you'll need to run last two lines in `if __name__ == '__main__':` clause or your spawned processes will also try to start processes, causing `concurrent.futures.process.BrokenProcessPool` error. – 김민준 Aug 07 '21 at 01:14
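The single-process variant mentioned in the comment above can be sketched roughly like this (an illustrative sketch, not from the answer, using `asyncio.run` and `get_running_loop` from Python 3.7+; a real server would create one executor at startup and reuse it for every request instead of building one per call):

```python
import asyncio
import time
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(num):
    # stand-in for one isolated, CPU-bound calculation
    time.sleep(0.1)
    return num * num

async def handle_request(num):
    # offload a single calculation; the event loop stays responsive meanwhile
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor(max_workers=1) as executor:
        return await loop.run_in_executor(executor, cpu_heavy, num)

if __name__ == '__main__':
    print(asyncio.run(handle_request(4)))  # prints 16
```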