
I am currently developing an API with several endpoints. One of them registers data in a database; the others are simple CRUD endpoints (get data by id, get all data, delete data, etc.).

When the register endpoint is called, a response is returned almost immediately and a background task is started, in which we fetch the data, unzip it if necessary, etc.

We are using FastAPI and async functions for this. What I have noticed, though, is that the API gets blocked by the execution of the background task. This is especially bad when I upload a large file to S3 in one go (not in chunks, for which I use async functions): I have to wait until the full file has finished uploading before any other request (such as one to the get-all-data endpoint) gets a response.

I am a newbie in parallelism and concurrency, but I was expecting the background task not to block the API.

Any ideas on how I could run this long-running background task in a way that it won't block new requests to the API? Would Celery be best for this?

Mock example:

from fastapi import FastAPI, BackgroundTasks
from fastapi.responses import JSONResponse

app = FastAPI()

@app.post("/register")
async def register_data(datainput, background_tasks: BackgroundTasks):
    # Do something
    background_tasks.add_task(background_operation)
    return JSONResponse(status_code=200, content="Doing stuff")

async def background_operation():
    # Doing stuff here
    await function_that_uploads_data_to_s3()
J.Doe

2 Answers


Update: what worked for me was rewriting my background function (and, by consequence, most of my code) to be synchronous (async def to def). FastAPI then runs the background task in a separate thread, which keeps the API responsive. I am not sure this is the best option, but it was the only thing that worked for now. In the future we will probably look into using Celery for this, or into separating the API service from a service responsible for the actual long-running background operations.
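The effect can be reproduced with plain asyncio: FastAPI runs a def background task in a worker thread, and asyncio.to_thread does the same thing for any blocking call, leaving the event loop free to serve other work. A minimal sketch (time.sleep is a stand-in for the synchronous S3 upload, and the ticker coroutine stands in for the API staying responsive):

```python
import asyncio
import time

def blocking_upload():
    # Stand-in for the synchronous (blocking) S3 upload.
    time.sleep(0.5)
    return "uploaded"

async def main():
    ticks = 0

    async def ticker():
        # Keeps running only if the event loop is not blocked.
        nonlocal ticks
        for _ in range(10):
            await asyncio.sleep(0.05)
            ticks += 1

    # Hand the blocking work to a worker thread; the loop stays free,
    # so the ticker makes progress while the "upload" is in flight.
    result, _ = await asyncio.gather(asyncio.to_thread(blocking_upload), ticker())
    return result, ticks

result, ticks = asyncio.run(main())
print(result, ticks)
```

Had blocking_upload been awaited directly inside an async def (or called without to_thread), the ticker could not have run until the sleep finished, which is exactly the blocking behavior described in the question.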


We had a similar problem, with the important difference that ours was a CPU-bound computation, unlike OP's file upload, which is I/O-bound. We also use Flask rather than FastAPI, although the solution would work for either.

I ended up spawning a process for each request (after some basic validations), something like this:

from multiprocessing import Process

@app.route('/start-job', methods=['POST'])
def start_job():
    validate_request()
    # Spawn a separate process so the CPU-bound work
    # does not block the Flask worker.
    p = Process(target=compute, args=(a, b, c, d))
    p.start()
    return "Job Started Successfully"

Probably not suited for production-grade applications, but you do get CPU parallelism with minimal effort.

Syed Saad