0

So I have this process that can be massively parallelized (up to 81x increase) but can't seem to get it to run in parallel with asyncio in a docker container.

The original code

def kernel(xarg):
    // this actually calls to another library which launches a subprocess.
    // Takes about 16 seconds
    return xarg**2

def main():
    foo = [i for i in range(81)]
    ret_vals = [kernel(i) for i in foo]

As you can imagine this is quite slow. So I decided to speed it up a bit using asyncio

import asyncio

async def kernel(xarg):
    return xarg**2

def main():
    loop = asyncio.get_event_loop()
    foo = [i for i in range(81)]
    ret_vals = loop.run_until_complete(asyncio.gather(*[kernel(i) for i in foo]))

However there is absolutely no speedup even when I increase cpu requests for the docker container this runs in.

Please let me know why this isn't working to speed things up. Sorry if it is really basic!

Jon E
  • 99
  • 7
  • async is for *concurrency*, not *parallelism*. It requires a task to *pause* for another to run. This can massively speed up I/O bound tasks, but will slow down CPU bound tasks. On top of that, your code isn't even concurrent because the kernel never pauses. – MisterMiyagi Mar 11 '21 at 05:58
  • Does this answer your question? [How does asyncio actually work?](https://stackoverflow.com/questions/49005651/how-does-asyncio-actually-work) – MisterMiyagi Mar 11 '21 at 06:03
  • Note that if the bulk of the work is done by subprocesses, then you already have natural parallelism. You merely have to not restrict it to just the main and one subprocess. While async can be used for this, a threading Pool (e.g. from concurrent.futures) will be both simpler and slightly faster. – MisterMiyagi Mar 11 '21 at 06:07
  • @MisterMiyagi yes and no. I understand now why it doesn't work, but am not sure how to allow for multiple subprocceses. I'm not sure how to allow the subproccess to launch as they are part of this 3rd party library https://pypi.org/project/cdo/, where the subprocesses are launched here: https://github.com/Try2Code/cdo-bindings/blob/9e3ecaab40ad2d1358a8b5dcdc7103f61adfb056/python/cdo.py#L294 As such I'm really confused as to how to use multiple subprocesses to execute this. Is this something that can be accomplished with semaphores? Please excuse my naiveté. – Jon E Mar 11 '21 at 17:11

0 Answers0