I have a Flask application that reads a dataframe and exposes it through a service. The problem is that I need to refresh the dataframe periodically (the refresh is just a re-read from S3). While the refresh is running, the dataframe needs to remain available, or the service could return some kind of error. Maybe this is possible with some sort of parallelism. My code is similar to this:
from flask import Flask, request, make_response
import pandas as pd
# this dataframe needs to be updated
df = pd.read_parquet("s3://data/data.parquet.gzip")
app = Flask(__name__)
# this application needs to stay available while df is being updated
@app.route('/application', methods=["POST"])
def application():
    data = request.json
    return make_response(function_(df, data), 200)
if __name__ == "__main__":
    app.run(host='0.0.0.0', port=8080)
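
One possible direction, as a minimal sketch rather than a definitive solution: keep the dataframe inside a small holder object and let a background thread reload it on a timer, swapping the reference only once the new copy is fully loaded. Requests keep reading the old copy until the swap, so the service stays available during the refresh. The class `RefreshingValue`, its `loader` and `interval_seconds` parameters, and the 600-second interval in the usage comment are all hypothetical names and values chosen for illustration; only the standard library is used so the idea can be tried without S3.

```python
import threading


class RefreshingValue:
    """Hypothetical holder: serves a value and reloads it in the background."""

    def __init__(self, loader, interval_seconds):
        self._loader = loader              # callable that returns a fresh value
        self._interval = interval_seconds  # seconds to wait between reloads
        self._lock = threading.Lock()
        self._value = loader()             # initial blocking load, as in the original code
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        # Only valid after start(); signals the loop to exit and waits for it.
        self._stop.set()
        self._thread.join()

    def get(self):
        # Readers only hold the lock long enough to copy the reference.
        with self._lock:
            return self._value

    def _run(self):
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._interval):
            try:
                fresh = self._loader()     # the slow read happens outside the lock
            except Exception:
                continue                   # on failure, keep serving the previous value
            with self._lock:
                self._value = fresh


# Assumed wiring into the app above (not run here):
# holder = RefreshingValue(lambda: pd.read_parquet("s3://data/data.parquet.gzip"), 600)
# holder.start()
# ...and inside application(): make_response(function_(holder.get(), data), 200)
```

Because the slow load runs outside the lock, a request never waits on S3; it only waits for the brief reference swap. Note that if the app runs under a WSGI server with several worker processes, each process would hold and refresh its own copy.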