So I'm building out a Flask API that returns data from a MySQL database. Instead of using Flask's built-in jsonify, I've opted for json.dumps, which appears considerably faster (likely because it isn't pretty-printing anything).

Whereas my SQL instance can spit out a large selection (~50,000 rows) in about 1-1.2 seconds, my R code (the client consuming this API) takes about 5 s to return that same data. If I use a small to_tibble function to transform the returned list into a tibble, that jumps to about 6.5-7 s. For smaller returns of <5,000 rows, it runs at 0.25-0.5 s.

That split is not the end of the world, but it gets considerably worse with much larger data. I understand that returning lengthy JSON is resource-intensive, but are there any "best practices" for speeding up the JSON return on the Flask end? My SQL tables have proper indexes and primary keys.

My return statement in the query execution function is just:

    return Response(dumps(data, default=str), mimetype='application/json')
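In case more context helps, here is a stripped-down sketch of the route (fetch_rows is a placeholder standing in for my actual MySQL cursor code):

    from json import dumps

    from flask import Flask, Response

    app = Flask(__name__)

    def fetch_rows():
        # placeholder for the real MySQL query; the route serializes a list of dicts
        return [{"id": 1, "created": "2021-01-01"}]

    @app.route("/rows")
    def rows():
        data = fetch_rows()
        # default=str stringifies values json can't handle natively (dates, decimals)
        return Response(dumps(data, default=str), mimetype="application/json")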

Not sure what other information would be relevant here, but I'm willing to share anything else. The SQL instance is running on 4 GB of RAM, and the Flask app on 2 GB (I think?).

codeweird

1 Answer


If you can use Pandas (plus SQLAlchemy or similar; you likely already have a compatible connector in your current implementation), it can do a very fast JSON export with a dedicated method, DataFrame.to_json().

Then, by building the response yourself, you avoid re-serializing the data and can return the text directly.

Perhaps as simple as:

import pandas as pd
import sqlalchemy
from flask import Flask

app = Flask(__name__)

# connection_details and query are placeholders for your connection string and SELECT;
# create the engine once at startup rather than per-request
engine = sqlalchemy.create_engine(connection_details)

@app.route("/data")
def data():
    df = pd.read_sql(query, engine)
    # orient="records" yields a list of row objects, matching json.dumps on a list of dicts
    json_string = df.to_json(orient="records")
    response = app.response_class(
        response=json_string,
        mimetype="application/json",
    )
    return response
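Note that to_json() defaults to orient="columns" for a DataFrame; if your client expects a list of row objects (the shape json.dumps gives you for a list of dicts), orient="records" matches that. A quick illustration:

    import pandas as pd

    df = pd.DataFrame([{"id": 1, "name": "a"}, {"id": 2, "name": "b"}])
    print(df.to_json())                  # {"id":{"0":1,"1":2},"name":{"0":"a","1":"b"}}
    print(df.to_json(orient="records"))  # [{"id":1,"name":"a"},{"id":2,"name":"b"}]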
ti7