
I am learning async and semaphores, and I have made an endpoint using FastAPI to test against. The FastAPI endpoint is just a simple server side that takes a request with a sleep time in it and sleeps that long before returning the response. I run it via uvicorn for testing purposes, but with 5 workers. This is for testing only; I understand that for production I should use gunicorn and nginx, but for the purposes of learning I am just using uvicorn.

uvicorn api_example:app --port 8008 --host cdsre.co.uk --workers 5

api_example.py

from time import sleep, time
import json
from fastapi import FastAPI, Response, Request


app = FastAPI()


@app.post("/sleeper")
async def sleeper(request: Request):
    request_data = await request.json()
    sleep_time = request_data['sleep_time']
    start = time()
    sleep(sleep_time)
    end = time()
    return Response(content=json.dumps({"slept for": end - start}))

The client code on my local machine tries to use async and a semaphore to run 3 POST requests in parallel. I have 6 requests and a sleep timer of 5 seconds, so the expectation is that it should take approximately 10 seconds (two batches of 3) to process the 6 requests.

async_example.py

import aiohttp
import asyncio
import time


async def get_http_response(session, url):
    async with semaphore:
        print("firing request...")
        start = time.time()
        async with session.post(url, json={"sleep_time": 5}) as resp:
            response = await resp.text()
            end = time.time()
            print(f"Client time: {end - start}, server time: {response}")
            return response


async def main():
    async with aiohttp.ClientSession() as session:
        tasks = []
        for number in range(6):
            url = f'http://cdsre.co.uk:8008/sleeper'
            tasks.append(asyncio.ensure_future(get_http_response(session, url)))
        responses = await asyncio.gather(*tasks)
        for response in responses:
            pass
            # print(response)


semaphore = asyncio.Semaphore(3)
start_time = time.time()
asyncio.get_event_loop().run_until_complete(main())
print("--- %s seconds ---" % (time.time() - start_time))

However, frequently a POST request takes twice as long as it should. In this case, with a sleep timer of 5 seconds, some requests take 10 seconds.

firing request...
firing request...
firing request...
Client time: 5.137275695800781, server time: {"slept for": 5.005105018615723}
firing request...
Client time: 10.158655643463135, server time: {"slept for": 5.0042970180511475}
Client time: 10.158655643463135, server time: {"slept for": 5.001959800720215}
firing request...
firing request...
Client time: 5.055504560470581, server time: {"slept for": 5.005110025405884}
Client time: 5.056135654449463, server time: {"slept for": 5.005115509033203}
Client time: 5.107320070266724, server time: {"slept for": 5.005107402801514}
--- 15.271023750305176 seconds ---

Sometimes it's 3 times as slow, and it's always a multiple of my sleep time, which makes me think there is some sort of queuing happening or some race condition I am missing. However, I thought the whole point of the semaphore pattern was to avoid this: by limiting to 3 requests at any one time, which is less than the number of workers available on the server side (5), there should always be a free worker on the server side to process each request.

I also don't start timing until inside the semaphore, so I am not starting it early; the timer should only start when the request is actually being sent. Hopefully I am just missing something obvious. I have left the endpoint URL up if anyone wants to try it. I would appreciate any help in solving this. Essentially I need to be able to write an async client that can send requests in parallel up to a limit and be consistent in measuring the response time.

Example of some requests taking 3 times as long:

firing request...
firing request...
firing request...
Client time: 15.127191305160522, server time: {"slept for": 5.001192808151245}
firing request...
Client time: 15.127155303955078, server time: {"slept for": 5.005094766616821}
Client time: 15.127155303955078, server time: {"slept for": 5.005074977874756}
firing request...
firing request...
Client time: 5.053789854049683, server time: {"slept for": 5.005076169967651}
Client time: 5.100871801376343, server time: {"slept for": 5.005076885223389}
Client time: 10.107984781265259, server time: {"slept for": 5.005110502243042}
--- 25.236175775527954 seconds ---
Chris Doyle
  • You’re using `time.sleep`. This will block the thread. You want to use `await asyncio.sleep(sleep_time)`. – dirn Apr 19 '21 at 12:34
  • Yeah, but I want the thread to block on the server end to simulate the app taking some time to do something. What I don't get is that I have 5 workers available on the server side and only ever send at most 3 requests at the same time, so there should always be a worker free to pick up the request, sleep for that time, then return the response. But sometimes the response time on the client side is 2 or 3 times slower, and it's always a multiple of the sleep time, i.e. if the sleep time is 5, it's always 5, 10 or 15, never 7 or 12 etc. So I figure there must be something on the client side. – Chris Doyle Apr 19 '21 at 12:41
  • However, having taken your comment and applied it in my code, the results are a lot more consistent. So what is the difference between time.sleep in a worker and await asyncio.sleep? I was thinking that each worker would get a separate request, so a sleep blocking that thread wouldn't affect the others. – Chris Doyle Apr 19 '21 at 12:45
  • Feel free to post that as an answer, as it's 100% solved my issue. Any further explanation about why it fixes it, or why sleep was causing the issue, would be greatly appreciated. – Chris Doyle Apr 19 '21 at 12:49

1 Answer


Continuing from the comments:

You’re using time.sleep. This will block the thread. You want to use await asyncio.sleep(sleep_time). - @dirn (this is the answer)

However, having taken your comment and applied it in my code, the results are a lot more consistent. So what is the difference between time.sleep in a worker and await asyncio.sleep? I was thinking that each worker would get a separate request, so a sleep blocking that thread wouldn't affect the others. – @Chris Doyle

While I have not checked how uvicorn sets up its workers by default, the whole point of async code is doing everything - or as much as possible - on the same thread. Since your views are defined as async, that is especially true: this coding paradigm does not expect async functions to block while waiting for any call, because while one of them blocks, that thread's event loop is stopped and no other request handled by the same worker can make progress.
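To see the effect in isolation (outside FastAPI/uvicorn), here is a minimal sketch: three tasks gathered on one event loop, first with a blocking time.sleep and then with await asyncio.sleep. Only the second batch actually overlaps.

import asyncio
import time


async def blocking_task():
    time.sleep(1)           # blocks the whole event loop; nothing else runs meanwhile


async def non_blocking_task():
    await asyncio.sleep(1)  # yields control; the other tasks run during the wait


async def main():
    start = time.time()
    await asyncio.gather(*(blocking_task() for _ in range(3)))
    print(f"blocking:     {time.time() - start:.1f}s")   # ~3s, the sleeps run back to back

    start = time.time()
    await asyncio.gather(*(non_blocking_task() for _ in range(3)))
    print(f"non-blocking: {time.time() - start:.1f}s")   # ~1s, the sleeps overlap


asyncio.run(main())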

If, instead of sleep, you have to call some code that will take time and is not async-ready, you can make that call asynchronous by running the delegate function in a separate thread. Python's asyncio makes that easy with the loop.run_in_executor call: by default it uses a thread-pool executor, runs your blocking function in a separate thread, and frees the current thread so the event loop can keep orchestrating other tasks.
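As a sketch of what that could look like for the /sleeper endpoint from the question (the body is otherwise unchanged, and the blocking time.sleep stands in for any non-async-ready call):

import asyncio
import json
from time import sleep, time
from fastapi import FastAPI, Request, Response

app = FastAPI()


@app.post("/sleeper")
async def sleeper(request: Request):
    request_data = await request.json()
    sleep_time = request_data['sleep_time']
    loop = asyncio.get_running_loop()
    start = time()
    # Run the blocking call in the default thread-pool executor, so the
    # event loop stays free to handle other requests on this worker.
    await loop.run_in_executor(None, sleep, sleep_time)
    end = time()
    return Response(content=json.dumps({"slept for": end - start}))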

Otherwise, as said, if you are using sleep for test purposes, just await asyncio.sleep instead. You can use await loop.run_in_executor(None, time.sleep, 5) to check how the executor approach behaves with a genuinely blocking function.
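For reference, here is the endpoint from the question with only the sleep line changed, which is the fix that made the client timings consistent:

import asyncio
import json
from time import time
from fastapi import FastAPI, Request, Response

app = FastAPI()


@app.post("/sleeper")
async def sleeper(request: Request):
    request_data = await request.json()
    sleep_time = request_data['sleep_time']
    start = time()
    await asyncio.sleep(sleep_time)  # suspends this coroutine without blocking the event loop
    end = time()
    return Response(content=json.dumps({"slept for": end - start}))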

jsbueno