Is asyncio.run_in_executor specified ambiguously?

Question

I have a server application that schedules some work when a client requests it, like this:

import asyncio
import time

def work():
    time.sleep(5)  # stand-in for blocking work

fut = asyncio.get_event_loop().run_in_executor(None, work)

I await fut later when it is requested explicitly. My use case requires that run_in_executor submit the work function immediately, and that behaves as expected in my environment (Ubuntu 16.04, Python 3.7.1).

Since my application depends on this behavior I wanted to verify that it is not something likely to change, so I checked several resources:

1. The documentation seems kind of vague. awaitable seems like it may apply to the method or the return value - though the body of the text does say it returns an asyncio.Future explicitly.
2. PEP 3156 that specifies asyncio - here it says nothing close to run_in_executor being a coroutine.
3. In a few issues whether run_in_executor is a function that returns an awaitable or a coroutine itself seems to be considered an implementation detail. See 25675 and 32327.
4. AbstractEventLoop.run_in_executor is specified as a coroutine, but the implementation in BaseEventLoop.run_in_executor is a plain function.

1 and 2 mostly seem to indicate that the current behavior is correct, but 3 and 4 are concerning. This seems like a very important part of the interface because if the function itself is a coroutine then it will not begin executing (therefore will not schedule the work) until it is awaited.

Is it safe to rely on the current behavior? If so, is it reasonable to change the interface of AbstractEventLoop.run_in_executor to a plain function instead of a coroutine?
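
To make the concern concrete, here is a minimal sketch of the difference between a coroutine function and a plain function that returns a future. The names coroutine_style, plain_style, and demo are hypothetical stand-ins for the two possible implementations; this is not asyncio's actual code.

```python
import asyncio


async def coroutine_style(log):
    # Body runs only once the event loop drives the coroutine object.
    log.append("coroutine body ran")


def plain_style(loop, log):
    # Body runs at call time; only the returned future is awaited later.
    log.append("plain body ran")
    return loop.create_future()


def demo():
    log = []
    loop = asyncio.new_event_loop()
    try:
        coro = coroutine_style(log)    # creates a coroutine object, runs nothing
        fut = plain_style(loop, log)   # the side effect happens here
        before = list(log)
        loop.run_until_complete(coro)  # now the coroutine body executes
        fut.cancel()                   # tidy up the placeholder future
        return before, list(log)
    finally:
        loop.close()
```

Calling demo() returns the log before and after the loop runs: only the plain function has left an entry in the first snapshot, which is exactly the property the question depends on.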

A simple way to resolve questions like "Is it guaranteed that this API will work this way on all platforms?" is to create unit tests and run them on all your target platforms. Then you don't have to wonder, you will know for sure. — John Zwinck, Jan 19 '19 at 04:48
@JohnZwinck, that's a good reminder that as long as the project has unit tests that run on every target platform/version then at least that helps reduce risk of a breaking change going unnoticed. — Chris Hunt, Jan 19 '19 at 16:16
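
A test along the lines suggested in these comments might look like the sketch below. The helper name submits_before_loop_runs and the two-second timeout are assumptions made for illustration. The idea is to call run_in_executor on a loop that has not been started and check that the worker runs anyway.

```python
import asyncio
import threading
import unittest


def submits_before_loop_runs(timeout=2.0):
    """Return True if run_in_executor starts the work while the loop is idle."""
    started = threading.Event()
    loop = asyncio.new_event_loop()
    try:
        # The loop is not running at this point, so a lazy (coroutine-style)
        # implementation would leave `started` unset.
        fut = loop.run_in_executor(None, started.set)
        ran_early = started.wait(timeout)
        if ran_early:
            # Let the loop deliver the worker's result before closing it.
            loop.run_until_complete(fut)
        return ran_early
    finally:
        loop.close()


class RunInExecutorTest(unittest.TestCase):
    def test_work_is_submitted_immediately(self):
        self.assertTrue(submits_before_loop_runs())
```

Such a test only reports on the interpreter and event loop implementation it runs against, so it reduces the risk of an unnoticed change rather than providing a guarantee.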

user4815162342 · Accepted Answer · 2019-01-21T07:23:25.753

> My use case requires that run_in_executor submit the work function immediately, and that behaves as expected in my environment

The current behavior is not guaranteed by the documentation, which only specifies that the function arranges for func to be called, and that it returns an awaitable. If it were implemented as a coroutine, it would not submit the work until the event loop ran the coroutine.

However, this behavior has been present since the beginning and it is extremely unlikely to change in the future. Delaying submission, though technically allowed by the docs, would break many real-world asyncio applications and constitute a serious backwards-incompatible change.

If you wanted to ensure that the task starts without depending on undocumented behavior, you could create your own function equivalent to run_in_executor. It really boils down to combining executor.submit and asyncio.wrap_future. Without frills, it could be as simple as:

import asyncio

def my_run_in_executor(executor, f, *args):
    return asyncio.wrap_future(executor.submit(f, *args))

Because executor.submit is called directly in the function, this version guarantees that the worker function is started without waiting for the event loop to run.
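
As a rough check of that claim, the sketch below submits a worker through my_run_in_executor and then deliberately blocks the event loop thread, without awaiting anything, until the worker signals that it has started. The name check_eager_start, the single-worker ThreadPoolExecutor, and the timeout are assumptions chosen for the illustration.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


def my_run_in_executor(executor, f, *args):
    return asyncio.wrap_future(executor.submit(f, *args))


async def check_eager_start(timeout=2.0):
    started = threading.Event()
    with ThreadPoolExecutor(max_workers=1) as executor:
        fut = my_run_in_executor(executor, started.set)
        # No await has happened yet, so the event loop has had no chance to
        # run anything. Blocking here is intentional and only for the demo.
        started_before_await = started.wait(timeout)
        await fut  # now let the loop deliver the worker's result
    return started_before_await
```

Running asyncio.run(check_eager_start()) reports whether the worker began before the coroutine first yielded to the loop. Blocking the loop like this is acceptable only in a demonstration.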

PEP 3156 explicitly states that run_in_executor is "equivalent to wrap_future(executor.submit(callback, *args))", thus providing the needed guarantee - but the PEP is not the official documentation, and the final implementation and specification often diverge from the initial PEP.

If one insisted on sticking to the documented interface of run_in_executor, it is also possible to use explicit synchronization to force the coroutine to wait for the worker to start:

async def run_now(f, *args):
    loop = asyncio.get_event_loop()
    started = asyncio.Event()
    def wrapped_f():
        loop.call_soon_threadsafe(started.set)
        return f(*args)
    fut = loop.run_in_executor(None, wrapped_f)
    await started.wait()
    return fut

fut = await run_now(work)
# here the worker has started, but not (necessarily) finished
result = await fut
# here the worker has finished and we have its return value

This approach introduces unnecessary implementation and interface complexity. Particularly jarring is the need to use await to obtain a future, which runs counter to how asyncio normally works. run_now is included only for completeness, and I would not recommend using it in production.

Thanks. I have lodged this as [bpo-35792](https://bugs.python.org/issue35792) and for now will call `asyncio.wrap_future` directly as you suggested. — Chris Hunt, Jan 21 '19 at 03:01
@ChrisHunt Thanks, I had a mind to do the same. Note that "will not actually be scheduled until the coroutine is **awaited**" is perhaps a bit too strong - one doesn't need to await a coroutine, but only schedule it and wait for it to start running. But it doesn't affect the point of the bug report. — user4815162342, Jan 21 '19 at 07:25
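
The distinction drawn in the last comment can be sketched as follows: a coroutine wrapped with asyncio.ensure_future starts running as soon as the event loop regains control, even though nothing has awaited it yet. The names submit and main are hypothetical, and submit merely stands in for a coroutine-style run_in_executor.

```python
import asyncio


async def main():
    log = []

    async def submit():
        # Stand-in for a coroutine that would hand work to an executor.
        log.append("started")

    task = asyncio.ensure_future(submit())  # scheduled, but not awaited
    before = list(log)      # nothing has run yet
    await asyncio.sleep(0)  # yield to the event loop once
    after = list(log)       # submit() has now started, still not awaited
    await task              # awaiting is only needed to collect the result
    return before, after
```

Here asyncio.run(main()) returns the two snapshots of the log, taken before and after the single yield to the loop.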
