
Let's say I'm tasked with migrating a Flask project to an async Python webserver. I'm looking for patterns to minimize the amount of work involved. It appears to me more or less impossible to port a sync webserver to an async webserver incrementally, which makes me think I've misunderstood async.

Suppose I want to use an asyncio SQL library inside an asyncio webserver. We might have to change the following stack of functions to async:

if __name__ == '__main__':
    # main must be called so that a coroutine object is passed in
    asyncio.get_event_loop().run_until_complete(main())

> async def main()
  > async def subfunc()
    > async def subsubfunc()
      > async def decorator1()
        > async def decorator2()
          > async def webapi()
            > async def decorator3()
              > async def decorator4()
                > async def servicemethod()
                  > async def servicemethod_impl()
                                ....
                    > async def decorator5()
                      > async def decorator6()
                        > async def repositorylayer()
                          > async def sqllibrary()
                            > async def sqllibrary2()
                              > async def asyncio_socket.read()

^^ Because we want to await asyncio_socket.read(), every function in the stack needs to be changed to use the async def declaration and to await its dependency. This has some serious consequences for refactoring:

  • We need to change up to n functions to get the benefit of one asyncio_socket.read(), and most of them care little about whether the socket read is sync or async. That is, we MUST declare each dependent function async and await the dependency's result (!)
  • Any function that used to depend on any function in this stack (but is not itself in the stack) must also change to be async (!). Even unit tests, which we might not be interested in switching to async today, must be changed:
# before: oldtest is a plain function
result = oldtest()
assert result==expected
# after: oldtest is now a coroutine function
result = asyncio.get_event_loop().run_until_complete(oldtest())
assert result==expected
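
As a minimal sketch of keeping such a test synchronous, assuming Python 3.7+ (where asyncio.run exists), the loop handling can be hidden in one helper. The names fetch_value and run_sync are hypothetical and stand in for the ported function and the helper:

```python
import asyncio


async def fetch_value():
    # Hypothetical coroutine standing in for a function that was ported to async.
    await asyncio.sleep(0)
    return 42


def run_sync(coro):
    # Drive a coroutine to completion from synchronous code.
    # This only works when no event loop is already running in this thread.
    return asyncio.run(coro)


def test_fetch_value():
    # The test body stays synchronous; only the call site changes.
    assert run_sync(fetch_value()) == 42


test_fetch_value()
```

This keeps the change to each test down to one wrapped call, but it does not help when the caller is itself already running inside an event loop.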

Generally, any sync function that calls an async function needs to be refactored to async/await; that is, async is a contagious feature. Any code that calls async code must be infected with async, whether it cares about async or not.

Because this means a global refactoring, it doesn't seem practical to incrementally port a webservice from sync-land to async-land in any but the smallest projects. I've seen solutions that move execution to threads at the sync/async barrier. However, this would seem to:

  • introduce thread safety issues
  • remove much of the benefit of async, because of the cost of cross-thread communication and context switching
  • reduce execution throughput because of the GIL.
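
For reference, the thread-at-the-barrier approach mentioned above usually looks something like the following sketch. It uses loop.run_in_executor from the standard library; legacy_query and handler are hypothetical names for an un-ported blocking function and an async request handler:

```python
import asyncio
import time


def legacy_query(n):
    # Hypothetical blocking function standing in for un-ported sync code.
    time.sleep(0.01)
    return n * 2


async def handler(n):
    loop = asyncio.get_running_loop()
    # Run the blocking call in the default thread pool so the event loop
    # stays free to serve other coroutines in the meantime.
    return await loop.run_in_executor(None, legacy_query, n)


async def main():
    # Three handlers run concurrently; each blocking call occupies one thread.
    return await asyncio.gather(handler(1), handler(2), handler(3))


results = asyncio.run(main())
```

This lets async code call sync code without rewriting it, but it does nothing about the thread-safety and GIL concerns listed above.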

However, in principle, it should be possible to call async functions from a sync function:

def syncfunc2():
   result = asyncio.get_event_loop().run_until_complete(asyncfunc1())
   return result

async def asyncfunc3():
   # plain call into sync code, which re-enters the event loop
   result = syncfunc2()
   return result

def syncfunc4():
   result = asyncio.get_event_loop().run_until_complete(asyncfunc3())
   return result

However, for reasons that aren't clear to me, Python doesn't allow this and fails with:

RuntimeError: This event loop is already running
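
A small self-contained reproduction of that failure, as a sketch: the names inner, sync_bridge and outer are hypothetical, and the error is caught only so the message can be inspected.

```python
import asyncio


async def inner():
    return "inner result"


def sync_bridge():
    # Sync function that tries to drive a coroutine on the loop that is
    # already running the caller.
    loop = asyncio.get_event_loop()
    coro = inner()
    try:
        return loop.run_until_complete(coro)
    finally:
        # Close the coroutine so a failed attempt does not leave it un-awaited.
        coro.close()


async def outer():
    try:
        return sync_bridge()
    except RuntimeError as exc:
        # Nested run_until_complete on a running loop is rejected.
        return str(exc)


message = asyncio.run(outer())
print(message)
```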

I think it is possible to safely implement re-entrant event loops. We used to do this for threaded executors when we ran out of threads: the caller of run_until_complete could drive execution of the event loop until its result is available, after which execution is returned to the original executor (which prevents a no-more-executors-but-waiting-on-execution deadlock). This is particularly easy in Python, because the GIL allows us to trivially guarantee that the event loop is either:

  • not being driven by another function
  • is waiting for the current function to call await

and so it's safe to pull a task from the queue and execute it. Because Python complains if you re-enter run_until_complete, it prohibits this, and with it the incremental introduction of async.

So:

  • Why isn't run_until_complete re-entrant?
  • Is it possible to incrementally introduce async in large codebases without resorting to additional threads (and the corresponding loss of the benefits of async)?
  • Is it the case that async has effectively forked the Python ecosystem into those libraries that use async and those that do not?

user48956

1 Answer

What you want to do is to use gevent. It'll allow you to serve multiple responses concurrently without threads or major modifications of synchronous code.

If you want to use an asyncio-based approach, you'll have to deal with the fact, that every function that deals with network I/O (like every controller or db interaction) should be rewritten to be async. This is done intentionally to help fight concurrency-related problems. It is the only way to always know for sure places where function can suddenly yield.

Mikhail Gerasimov
  • The last sentence is the key reason _why_ it's disallowed to nest runs of the same event loop. Being able to tell where a coroutine will yield just by looking at awaits is a major design goal of asyncio, and the reason awaits (previously yield froms) are explicit. – user4815162342 Jan 15 '20 at 10:01
  • Thanks - will look at gevent. My big beef with asyncio is that it requires code to be modified even if it doesn't deal with network I/O. It (needlessly) even requires adding async to decorators that do things as unrelated as timing function calls. It didn't need to be this way ... and it makes me sad they went this route. – user48956 Jan 15 '20 at 22:05
  • "you'll have to deal with the fact, that every function that deals with network I/O (like every controller or db interaction) should be rewritten to be async" ... OK. Yes. But async also requires that things that are not I/O related be reimplemented as async, such as a @cached decorator. "It is the only way to always know for sure places where function can suddenly yield." ... at runtime. This viral behavior will introduce a multitude of new bugs when porting legacy code. You can't just update I/O functions. This is awful design. – user48956 Jan 16 '20 at 00:50