
Below is the relevant snippet from my code. I am following what is described in the documentation. Based on a related question, I suspect that too many threads may be the issue, but I am not sure how to limit them. I can't use run_in_executor with ThreadPoolExecutor because handling pandas DataFrames is difficult that way. Can anyone advise how to solve this blocking issue?

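import asyncio
from functools import reduce

import pandas as pd
from fastapi import FastAPI
from fastapi.concurrency import run_in_threadpool

app = FastAPI()

async def foo(param_1, param_2):
    # bar is a blocking function (defined elsewhere) that returns two DataFrames;
    # run_in_threadpool runs it in a worker thread so the event loop is not blocked
    df_a, df_b = await run_in_threadpool(bar, param_1, param_2)
    df = df_a.merge(df_b, on='name', how='left')
    return df

@app.get('/get_foo')
async def get_price_comparison():
    # list_of_required_entry (defined elsewhere) holds the (param_1, param_2) pairs
    tasks = []
    for param_1, param_2 in list_of_required_entry:
        tasks.append(asyncio.create_task(foo(param_1, param_2)))
    data = await asyncio.gather(*tasks)

    # Left-join all per-entry DataFrames on 'name', then serialize to a list of records
    data_dataframe = reduce(lambda left, right: pd.merge(left, right, on='name', how='left'), data)
    json_data = data_dataframe.to_dict(orient='records')
    return json_data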
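
For reference, one way the number of simultaneously running blocking calls could be capped is an `asyncio.Semaphore` around the thread call. This is only a rough sketch under assumptions, not a confirmed fix: it uses the standard library's `asyncio.to_thread` as a stand-in for `run_in_threadpool`, and `MAX_CONCURRENT`, `limited_foo`, `run_all` and the dummy `bar` are hypothetical names that are not in my real code.

```python
import asyncio
import time

MAX_CONCURRENT = 4  # hypothetical cap on blocking calls running at once


def bar(param_1, param_2):
    """Stand-in for the real blocking function; here it only sleeps and adds."""
    time.sleep(0.01)
    return param_1 + param_2


async def limited_foo(semaphore, param_1, param_2):
    # At most MAX_CONCURRENT coroutines get past this point at a time,
    # so at most that many worker threads are busy with bar().
    async with semaphore:
        return await asyncio.to_thread(bar, param_1, param_2)


async def run_all(entries):
    # Create the semaphore inside the running event loop.
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    coros = [limited_foo(semaphore, a, b) for a, b in entries]
    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*coros)


if __name__ == "__main__":
    print(asyncio.run(run_all([(i, i) for i in range(10)])))
```

In the endpoint, the same `async with semaphore:` block would presumably wrap the `await run_in_threadpool(bar, param_1, param_2)` call inside `foo`. Whether this actually removes the blocking depends on where the time is really spent.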
r ram
  • The reason for being too slow might have to do with the way of converting the DataFrame and returning the data back to the client. You could use `time.time()` to measure the elapsed time between two points. Have a look at the last sections of [this](https://stackoverflow.com/a/70667530/17865804) and [this](https://stackoverflow.com/a/73443824/17865804) on how to do that. – Chris Sep 03 '23 at 03:30
  • Please have a look at [this answer](https://stackoverflow.com/a/71205127/17865804), [this answer](https://stackoverflow.com/a/73694164/17865804) and [this answer](https://stackoverflow.com/a/73580096/17865804) on how to return JSON/DataFrame data in FastAPI. You might find [this answer](https://stackoverflow.com/a/73974946/17865804) and [this answer](https://stackoverflow.com/a/71517830/17865804) helpful as well. – Chris Sep 03 '23 at 03:33
  • OK. While going through all those answers, I also came across this question. Could this be the reason: https://stackoverflow.com/questions/70927983/fastapi-run-in-threadpool-getting-stuck – r ram Sep 03 '23 at 04:04
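
The timing suggestion in the first comment could be sketched roughly as below. This is a minimal illustration, not code from the question: `timed` is a hypothetical helper, and it uses `time.perf_counter()` rather than the `time.time()` mentioned in the comment, since `perf_counter` is the clock intended for measuring short intervals.

```python
import time


def timed(label, func, *args, **kwargs):
    """Call func, print how long it took, and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.4f}s")
    return result, elapsed


if __name__ == "__main__":
    # Example: time a plain blocking call.
    total, seconds = timed("sum", sum, range(1_000_000))
```

Wrapping the `reduce(...)` merge and the `to_dict(orient='records')` call separately in this way would show whether the conversion step or the threaded work accounts for most of the delay.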

0 Answers