1

I have a FastAPI endpoint like this:

@app.get("/api/triggers")
async def get_triggers(s_id: str = Query(None), trigger_id: str = None, limit: str = None, offset: int = None,
                    session: Session = Depends(get_db)):
    if trigger_id:
        if "," in trigger_id:
            trigger_id = trigger_id.replace(" ", "").split(',')
        else:
            trigger_id = [trigger_id]
        trigger_info = session.query(Trigger).filter(Trigger.id.in_(trigger_id)).offset(offset).limit(limit).all()
        for key in trigger_info:
            del key.geometry_global
        return trigger_info
    if s_id:
        if "," in s_id:
            s_id = s_id.split(',')
        else:
            s_id = [s_id]
        result = session.query(Trigger).options(load_only("id", "geometry_global")).offset(offset).limit(limit). \
            filter(Trigger.s_id_fk.in_(s_id)).all()
    else:
        result = session.query(Trigger).options(load_only("id", "geometry_global")).offset(offset).limit(limit).all()
    return ORJSONResponse(result)

How can I speed it up? There are million of rows in the table and json response seems to be very slow. I have added pagination, so that chunks of data can be retrieved by multiple api calls. But Is there any other way by which response can be speed up?

Chris
  • 18,724
  • 6
  • 46
  • 80
Deb
  • 91
  • 13
  • 1
    is the problem related to the `SQL` query? – Danila Ganchar May 02 '23 at 09:27
  • 1
    If query can be optimized that would be good, but the main issue is that the JSON response is very large, which takes a lot of time to be sent. – Deb May 02 '23 at 09:48
  • 1
    are you sure that the main delay is when sending a response, and not in processing, `json.dumps()` etc? – Danila Ganchar May 02 '23 at 09:55
  • Does this answer your question? [FastAPI is very slow in returning a large amount of JSON data](https://stackoverflow.com/questions/73564771/fastapi-is-very-slow-in-returning-a-large-amount-of-json-data) – Chris May 02 '23 at 09:58
  • You may also find [this answer](https://stackoverflow.com/a/73694164/17865804) and [this answer](https://stackoverflow.com/a/73974946/17865804) helpful – Chris May 02 '23 at 10:00
  • 1
    May I ask how exactly are you testing the endpoint? Is it through Swagger UI, or accessing the endpoint directly from the browser, or some other means? If it is either one of the first two, the delay you experience has mostly to do with the browser displaying the data (hence, client side issue), as explained in the first link above. If so, why would you want to display _"millions of rows"_ in a browser, instead of returning them as an `attachment` file that the client could download? Please have a look at all the links provided above, which provide examples and explanation around this topic. – Chris May 02 '23 at 11:14
  • The api is being called in a react app, where we have a map & all the data needs to be plotted. So it takes some time to plot the data. I have looked at the answers provideed above. I added ORJSONresponse, but it didn't help much. Maybe i need to look at other serialization techniques like protobuf & flatbuffer along with grpc apis instead of rest. – Deb May 03 '23 at 03:54

1 Answers1

0

Problem overview

If you are accessing the endpoint directly from the browser, the delay you may experience has mostly to do with the browser displaying the data, as you mentioned there are millions of rows to be returned; hence, this is a client-side issue and not so much a server-side one.

Suggestions

  1. Instead of returning an ORJSONResponse, which would result in displaying the data in the browser, you could instead return a custom Response and set the Content-Disposition response header, using the attachment parameter, indicating to the browser that the data should be downloaded as a file, instead of viewed in the browser. Please have a look at this answer, as well as here and here for more details on returning JSON data and how to speeed up this process. Example:

    import orjson
    # ...
    results = session.query(...
    headers = {'Content-Disposition': 'attachment; filename="data.json"'}
    return Response(orjson.dumps(results), headers=headers, media_type="application/json")
    
  2. If you still need to display the results in the browser, I would suggest having a frontend—which could easily be created using Jinja2 templates and HTML, as shown here and here—and instead of fetching all the results at once (since there are millions of rows, which would not make any sense to return them all at once anyway), rather fetch a few rows at a time (e.g., 1000 rows) and have a Load more button on which users can click to fetch more results (or use pagination). Additionally, provide a filter mechanism that would allow users to filter the results as desired, and hence, retrieve only those results they are interested in, resulting in faster query execution and less data to be returned.

Chris
  • 18,724
  • 6
  • 46
  • 80