
Understanding Uvicorn asynchronous behavior

I am trying to understand the behavior of Uvicorn. I have created a sample FastAPI app that mainly sleeps for 5 seconds.

import time
from datetime import datetime


from fastapi import FastAPI


app = FastAPI()

counter = 0

@app.get("/")
def root():
    global counter
    counter = counter + 1
    my_id = counter
    print(f'I ({my_id}) am feeling sleepy')
    time.sleep(5)
    print(f'I ({my_id}) am done sleeping')
    return {}

I called my app with the following Apache Bench command:

ab -n 5 -c 5 http://127.0.0.1:8000/

Output:

I (1) am feeling sleepy  -- 0s
I (1) am done sleeping   -- 5s
I (2) am feeling sleepy  -- 5s
I (3) am feeling sleepy  -- 5s
I (4) am feeling sleepy  -- 5s
I (5) am feeling sleepy  -- 5s
I (2) am done sleeping   -- 10s
I (4) am done sleeping   -- 10s
I (3) am done sleeping   -- 10s
I (5) am done sleeping   -- 10s

Why are requests running concurrently? I ran the app as:

uvicorn main:app --workers 1

Please note that I did not use the async keyword, so I expected everything to be completely synchronous.

From the FastAPI docs:

When you declare a path operation function with normal def instead of async def, it is run in an external threadpool that is then awaited, instead of being called directly (as it would block the server).

Where is this threadpool? As I am using sleep, I thought the only available worker would be completely blocked.

poiuytrez
  • Might be relevant: https://github.com/tiangolo/fastapi/issues/4591 . They stated that no matter the `workers` count, there are still 6 threads, hmm. – doneforaiur Jun 21 '23 at 15:38
  • 1
    tldr: FastAPI maintains an internal threadpool that requests gets handed off to when you're not defining your endpoints as async. – MatsLindh Jun 21 '23 at 21:07
  • My question is not a duplicate of the other one. In my case, I don't understand why the app is async when it should be synchronous. The answers in the linked question do not answer my question. If every synchronous call is handled in the worker pool, why are my calls concurrent when there is only one worker? – poiuytrez Jun 22 '23 at 07:46
  • 1
    @MatsLindh I found on one of your comments that FastAPI uses a threadpool of 40 threads internally to handle requests using non-async endpoints. Where did you get this info? Thanks! – poiuytrez Jun 22 '23 at 08:14
  • I found the solution (maybe from you) in the middle of a github discussion: https://github.com/tiangolo/fastapi/issues/4221#issuecomment-982260467. Too bad that I can't add it as an answer to this question because it is closed. – poiuytrez Jun 22 '23 at 08:23
  • Yes, the duplicate close is wrong (i.e. it should probably point to the question about the threadpool if anything, not the question Chris associated it with). While the issue is about how to change it, to find exactly _why_ it is 40, you'll have to go on a bit of a journey - FastAPI uses Starlette's concurrency support to run requests in a threadpool: https://github.com/tiangolo/fastapi/blob/b7ce10079eb23873d5c54e264cd3618ac890d7b3/fastapi/concurrency.py#L8 - Starlette uses anyio: https://github.com/encode/starlette/blob/master/starlette/concurrency.py#L35 which in turn has 40 as the default – MatsLindh Jun 22 '23 at 08:26
  • 1
    value for size of their threadpool: https://anyio.readthedocs.io/en/stable/threads.html#adjusting-the-default-maximum-worker-thread-count - so that's where the 40 number actually comes from (if I remember the comment thread correctly I initially thought the size was 100, but someone else pointed out that it was 40 - but that might have been a comment section before that question :-). So to sum it up: A single worker can run 40 "concurrent" threads (they'll be context switching, so .. not really, but it works fine), so it'll appear as concurrent, even with a single worker. – MatsLindh Jun 22 '23 at 08:28
  • If you're running things as async and never giving up processing time by calling `await`, the number of workers will be identical to the number of requests you can handle at the same time - since each worker will be stuck processing a single connection. However, if you change your example to use `asyncio.sleep` instead (and define the method as async), you'll handle many thousand connections with a single worker at the same time (as they're just giving up their processing time by calling `await`). – MatsLindh Jun 22 '23 at 08:29
  • @Chris OP is asking _why_ something is not behaving as expected given the worker count. While _you_ know that this is caused by the endpoint being defined as sync and that being run in a threadpool as your previous answer details, _the asker does not have this context to make that deduction_. The question is _why the actual number of workers doesn't affect this_. When closing a question as a duplicate, while an answer can be deduced from the linked question in some way and hidden somewhere in a paragraph, OP or future readers might not be able to make that deduction. – MatsLindh Jun 22 '23 at 21:09
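
A standalone sketch of MatsLindh's last point (plain asyncio, no FastAPI; the 0.2-second sleep is a stand-in for the question's 5-second one): handlers that await asyncio.sleep yield control to the event loop, so five of them finish in roughly the time of one sleep, all on a single thread.

```python
import asyncio
import time

async def handler(request_id: int) -> int:
    # await hands control back to the event loop, letting other handlers run
    await asyncio.sleep(0.2)
    return request_id

async def main() -> float:
    start = time.monotonic()
    # Five "requests" served concurrently by a single thread
    await asyncio.gather(*(handler(i) for i in range(5)))
    return time.monotonic() - start

elapsed = asyncio.run(main())
print(f"5 concurrent handlers took {elapsed:.2f}s")  # close to 0.2s, not 1.0s
```

If the handlers called time.sleep instead, the same run would take about a full second, because a blocking sleep never yields to the loop.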

1 Answer


FastAPI uses Starlette, which uses AnyIO behind the scenes. By default, AnyIO provides a thread pool of 40 threads for running synchronous endpoints, and that pool is what allows several synchronous requests to execute concurrently.

The pool size can be configured:

from anyio import CapacityLimiter
from anyio.lowlevel import RunVar

from fastapi import FastAPI

app = FastAPI()

@app.on_event("startup")
def startup():
    # Shrink AnyIO's default thread pool (40 threads) down to 2
    RunVar("_default_thread_limiter").set(CapacityLimiter(2))

Source: https://github.com/tiangolo/fastapi/issues/4221#issuecomment-982260467

Kudos to @MatsLindh.

  • Please have a look at [this answer](https://stackoverflow.com/a/71517830/17865804), which should sufficiently answer this question and help you understand how FastAPI works under the hood. – Chris Jun 23 '23 at 03:07