
Following up on this question: Why do UVICORN/Starlette/FastAPI spawn more threads when not using "ASYNC" and don't when using "ASYNC"? I'm trying to build a single-threaded asyncio app with FastAPI and langchain. I've noticed that as soon as I add even a bare import of langchain, the process immediately gets multiple threads. E.g., this code:

import asyncio
import random
import sys
from typing import Dict
import uvicorn
from fastapi import FastAPI
from fastapi.responses import HTMLResponse, JSONResponse
from loguru import logger
log_format = "{level} {process}-{thread} {time} {name}:{line} - {message}"
logger.remove()
logger.add(sys.stderr, format=log_format, backtrace=True, diagnose=True)
logger.add(
    "logs/" + "t_{time}.log",
    format=log_format,
    colorize=True,
    backtrace=True,
    diagnose=True,
)

app = FastAPI()
@app.get("/", response_class=HTMLResponse)
async def index():
    basic_side = """<!DOCTYPE html>
<html>
<body>

<h1>My First Heading</h1>

<p>My first paragraph.</p>

</body>
</html>
"""
    logger.info("received GET")
    return HTMLResponse(basic_side)

@app.post("/message")
async def message(json_data: Dict):
    value = await asyncio.sleep(random.randint(5, 30), result=100)
    logger.info("Received message")
    message = json_data["message"]
    return {"message": message[::-1]}


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=5000)

Creates only one python thread:

ps -T -p 1755163
    PID    SPID TTY          TIME CMD
1755163 1755163 pts/0    00:00:00 python

Whereas simply adding this line at the top of the file:

import langchain

or importing anything else from langchain, changes the thread list (even with no requests coming in) to:

ps -T -p 1755500
    PID    SPID TTY          TIME CMD
1755500 1755500 pts/0    00:00:01 python
1755500 1755501 pts/0    00:00:00 python
1755500 1755502 pts/0    00:00:00 python
1755500 1755503 pts/0    00:00:00 python
1755500 1755504 pts/0    00:00:00 python
1755500 1755505 pts/0    00:00:00 python
1755500 1755506 pts/0    00:00:00 python
1755500 1755507 pts/0    00:00:00 python

Interestingly, the app's logs still show only one thread ID. However, when I extend the app with a Chain from langchain and do something like this inside the message function:

preds = await chain.arun('text')

I see that as those calls take longer (e.g. when the API rate limit is hit and they go to sleep), more and more threads are created, and at around 30-40 threads the app gets completely stuck. Even calls to "GET /" are not answered. My understanding is that in this case I'm hitting FastAPI behavior that is not truly "async": the work is just split across those ~40 threads, and once all of them are busy the app is completely stuck.
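To illustrate the kind of thread growth I suspect, here is a minimal sketch, not taken from langchain itself. It assumes (unconfirmed) that the slow calls end up being offloaded to asyncio's default thread pool via `run_in_executor`; `blocking_call` and `main` are hypothetical names. The default executor starts worker threads lazily, one per concurrent blocking call, up to a fixed cap, after which further calls have to queue.

```python
import asyncio
import threading
import time


def blocking_call(seconds: float) -> str:
    # Stand-in for a synchronous call that sleeps, e.g. while rate limited.
    time.sleep(seconds)
    return threading.current_thread().name


async def main(n_calls: int = 4):
    loop = asyncio.get_running_loop()
    before = threading.active_count()
    # Passing None selects the loop's default ThreadPoolExecutor.
    futures = [
        loop.run_in_executor(None, blocking_call, 0.2) for _ in range(n_calls)
    ]
    # Worker threads are started when the work is submitted.
    during = threading.active_count()
    names = set(await asyncio.gather(*futures))
    return before, during, names


if __name__ == "__main__":
    before, during, names = asyncio.run(main())
    print(f"threads before: {before}, while calls are pending: {during}")
    print(f"worker threads used: {sorted(names)}")
```

If this is what happens in my app, every pending slow call would hold one worker thread, which would match the thread count climbing as the calls get longer.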

What I can't understand:

  • why even a simple import causes multiple threads to be created
  • why the logs show only one thread ID
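A minimal sketch of how the two views could be compared from inside the process, assuming Linux (it reads `/proc/self/task`, the same information `ps -T` shows). The helper names `python_thread_count` and `os_thread_ids` are hypothetical. The `threading` module only knows about threads started through Python, so a gap between the two numbers would suggest that the extra threads are native threads started by a C extension, which is an assumption I have not verified for langchain.

```python
import os
import threading


def python_thread_count() -> int:
    # Threads the Python threading module is tracking.
    return threading.active_count()


def os_thread_ids() -> list:
    # Native thread IDs of this process (the SPID column of `ps -T`).
    # Returns an empty list where /proc is unavailable (non-Linux systems).
    task_dir = "/proc/self/task"
    if not os.path.isdir(task_dir):
        return []
    return sorted(int(tid) for tid in os.listdir(task_dir))


if __name__ == "__main__":
    print("Python-level threads:", python_thread_count())
    print("OS-level threads:    ", len(os_thread_ids()))
```

Running this once before and once after `import langchain` should show whether the new threads are visible to Python at all. A logger only reports the thread that made the logging call, so a single thread ID in the logs would be expected if every handler runs on the event loop thread.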
bob