I answered this question with the assumption that there is some lack of knowledge on some of the fundamental understandings of threads/processes/async loop. If there was not, forgive me for the amount of detail.
First thing to note is that processes and threads are two separate concepts. This answer might give you some context. To expand:
Processes are run directly by the CPU, and if the CPU has multiple cores, processes can be run in parallel. Inside processes is where threads are run. There is always at least 1 thread per process, but there can be more. If there are more, the process switches between which thread it is executing after every (specific) millisecond (dictated by things out of the scope of this question)- and therefore threads are not run in absolute parallel, but rather constantly switched in and out of the CPU (at least as it pertains to Python, specifically, due to something called the GIL). The async loop runs inside a thread, and switches context relating specifically to I/O-bound instructions (more of this below).
Regarding this question, it's worth noting that Gunicorn workers are processes, and not threads (though you can increase the amount of threads per worker).
The intention of asynchronous code (with the use of async def
, await
, and asyncio
) is to speed-up performance as it specifically relates to I/O bound tasks. Stuff like getting a file from disk, sending/receiving a network request, or anything that requires a physical piece of your computer - whether it is SSD, or the network card - other than the CPU to do some work. It can also be used for large CPU-bound instructions, but this is usually where threads come in. Note that I/O bound instructions are much slower than CPU bound instructions as the electricity inside your computer literally has to travel further distances, as well as perform extra steps in the hardware level (to keep things simple).
These tasks waste the CPU time (or, more specifically, the current process's time) on simply waiting for a reply. Asynchronous code is run with the help of a loop
that auto-manages the context switching of I/O bound instructions and normal CPU bound instructions (dependent on the use of await
keywords) by leveraging the idea that a function can "yield" control back to the loop, and allow the loop to continue processing other pieces of code while it waits. When async code sends an I/O bound instruction (e.g. grab the latest packet from the network card), instead of sitting still and waiting for a reply it will switch the current process' context to the next task in its list to speed up general execution time (adding that previous I/O bound call to this list to check back in later). There is more to this, but this is the general gist as it relates to your question.
This is what it means when the docs says:
While a Task is running in the event loop, no other Tasks can run in the same thread.
The async loop is not running things in parallel, but rather constantly switching context between different instructions for a more optimized CPU + I/O relationship/execution.
Processes, in the other hand, run in parallel in your CPU assuming you have multiple cores. Gunicorn workers - as mentioned earlier - are processes. When you run multiple async workers with Gunicorn you are effectively running multiple asyncio.loop
in multiple (independent, and parallel-running) processes. This should answer your question on:
Can both request 1 and request 2 try to execute the asyncio.run in parallel in different threads\processes ? Will one block the other ?
If there is ever the case that one worker gets stuck on some extremely long I/O bound (or even non-async computation) instruction(s), other workers are there to take care of the next request(s).