
I have 3 functions: func_1, func_2, and func_3. I would like to run these asynchronously, so that I do not have to wait for func_1 to finish before func_2 starts executing.

The problem is that the definition of func_1, for example, looks something like this:

async def func_1(a, b):
    x = some_sync_func(a)
    y = some_other_sync_func(b)
    z = another_sync_func(x, y)
    return yet_another_sync_func(z)

The functions I am calling within func_1 are all synchronous and non-awaitable. Thus, they will block the execution of func_2 and func_3.
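To illustrate the blocking problem, here is a minimal sketch (with `time.sleep` standing in for the synchronous work) showing that a plain synchronous call inside a coroutine stalls the whole event loop:

```python
import asyncio
import time

async def blocking_coro():
    time.sleep(0.2)  # synchronous call: blocks the whole event loop
    return "done"

async def other_coro():
    return "other"

async def main():
    start = time.monotonic()
    # Even though both coroutines are scheduled together, the
    # time.sleep() inside blocking_coro stalls the loop, so
    # other_coro cannot run until the full 0.2 s has elapsed.
    results = await asyncio.gather(blocking_coro(), other_coro())
    return results, time.monotonic() - start

results, elapsed = asyncio.run(main())
```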

I read here that loop.run_in_executor() can be used to call synchronous functions from asynchronous functions without blocking the execution. Thus, I modified the definition of func_1 as follows:

import asyncio

async def func_1(a, b):
    loop = asyncio.get_event_loop()
    x = await loop.run_in_executor(None, some_sync_func, a)
    y = await loop.run_in_executor(None, some_other_sync_func, b)
    z = await loop.run_in_executor(None, lambda: another_sync_func(a, b))
    w = await loop.run_in_executor(None, yet_another_sync_func, z)
    return w

Is this the right way to deal with this problem? Am I using loop.run_in_executor() correctly? Here, the docs provide an example which seems to support this. I don't know what threads are, or what a "process pool" is, and haven't really been able to make much sense of the docs.

Datajack
    Seems right to me. Be aware that Python is not very efficient at multithreading, so the gain of the change may not be as great as expected. This depends on what the synchronous functions actually do. – Michael Butscher Jun 11 '23 at 13:22
    Yes, these awaits make sure that while the long sync functions are running, execution gets suspended to the event loop, which gets to service other tasks running at the same time. Also, consider replacing `run_in_executor` with the more modern [`asyncio.to_thread`](https://docs.python.org/3/library/asyncio-task.html#asyncio.to_thread). – user4815162342 Jun 11 '23 at 20:48
    *I don't know what threads are, or what a "process pool" is* - threads are units of execution that run in parallel inside your process. A thread pool is a collection of threads to which you can submit tasks that you want to run asynchronously. A process pool is like a thread pool, but it hands tasks to separate processes. `run_in_executor()` uses an underlying abstraction that allows either thread or process pools to be used. If you just pass `None` as the first argument, it will use a default thread pool set up by asyncio. As mentioned in the previous comment, you can also use `to_thread`. – user4815162342 Jun 11 '23 at 20:54
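Building on the comments above, here is a minimal sketch of the `asyncio.to_thread` variant (available since Python 3.9; `some_sync_func` below is a stand-in for the real synchronous work):

```python
import asyncio
import time

def some_sync_func(a):
    time.sleep(0.1)  # stand-in for blocking synchronous work
    return a * 2

async def func_1(a):
    # asyncio.to_thread runs the sync function in the default
    # thread pool, equivalent to loop.run_in_executor(None, ...)
    # but without needing a loop reference.
    return await asyncio.to_thread(some_sync_func, a)

result = asyncio.run(func_1(21))  # → 42
```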

1 Answer


Almost right, but since you are awaiting eagerly at each function call, the next line of code in each case will only run once the awaited line finishes execution; within func_1 the calls still execute sequentially.

However, if you call func_1 concurrently from some other place, two instances of func_1 will run in parallel. (I am almost sure that is not what you want.)

So, for these other functions to actually run in parallel (in other threads), you have to create the task to run each of them but not await it immediately; instead, you gather all the tasks you want to run in parallel and await them at once (usually with the aptly named asyncio.gather):

...

async def func_1(a, b):
    loop = asyncio.get_event_loop()
    task_x = loop.run_in_executor(None, some_sync_func, a)
    task_y = loop.run_in_executor(None, some_other_sync_func, b)
    task_z = loop.run_in_executor(None, lambda: another_sync_func(a, b))
    x, y, z = await asyncio.gather(task_x, task_y, task_z)
    # this call depends on `z`, so it is not included in the gather.
    # if its return value is not important, you can omit the await,
    # return the task, and await it sometime later.
    w = await loop.run_in_executor(None, yet_another_sync_func, z)
    return w
...
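To tie this back to the original question, here is a self-contained sketch (with a hypothetical `slow_square` standing in for the synchronous functions, and only two of the three top-level functions shown) of running func_1 and func_2 concurrently with asyncio.gather:

```python
import asyncio
import time

def slow_square(n):
    time.sleep(0.1)  # stand-in for blocking synchronous work
    return n * n

async def func_1(a, b):
    loop = asyncio.get_running_loop()
    # run both executor calls in parallel threads, await together
    x, y = await asyncio.gather(
        loop.run_in_executor(None, slow_square, a),
        loop.run_in_executor(None, slow_square, b),
    )
    return x + y

async def func_2(a):
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, slow_square, a)

async def main():
    start = time.monotonic()
    # func_1 and func_2 run concurrently; all three 0.1 s sleeps
    # overlap in the default thread pool, so the total elapsed
    # time is roughly 0.1 s rather than 0.3 s
    r1, r2 = await asyncio.gather(func_1(2, 3), func_2(4))
    return r1, r2, time.monotonic() - start

r1, r2, elapsed = asyncio.run(main())
```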
jsbueno