I have a Flask API in which each endpoint calls a function like this:

import json
import logging
from concurrent import futures

from flask import request

# `app`, `etl_ads` and `__try_parse_str` are defined elsewhere in the project.

@app.route('/Ads', methods=['POST'])
def ads():
    data = request.get_json()
    return execute_action(etl_ads, data)

def execute_action(action, *args):
    try:
        logging.info('Starting {0}: {1}'.format(action.__name__, __try_parse_str(*args)))
        # A new process pool is created (and torn down) on every request.
        with futures.ProcessPoolExecutor(max_workers=5) as executor:
            result = executor.submit(action, *args).result()
            logging.info('Finished {0}'.format(action.__name__))
            return result
    except Exception as e:
        logging.error('Error on {0}: {1}'.format(action.__name__, e))
        return json.dumps({'error': str(e)}), 500, {'Content-Type': 'application/json'}

The body of this request is:

{
    "file_urls": ["blob_url"],
    "meta_data": {
        "job_creation_time": "2022-11-08T09:36:00"
    },
    "company_tenant": { 
        "id": "tenant_id" 
    }
}

etl_ads is a function that downloads the content of the URLs (each URL points to a JSON file), transforms the data, and saves it to BigQuery. The process is the same for all endpoints. The only difference is how the data is transformed, which is why the function is passed as a parameter to execute_action.
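To make the shape of these functions concrete, here is a minimal sketch of the download, transform, save flow described above. It is not the real etl_ads: `run_etl`, `transform_ads`, `fetch` and `save` are hypothetical names, and the download and BigQuery steps are passed in as plain callables so the sketch stays self-contained.

```python
import json
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


def transform_ads(record: Record, tenant_id: str) -> Record:
    """Hypothetical transform: tag each record with the tenant id."""
    return {**record, "tenant_id": tenant_id}


def run_etl(
    payload: Dict[str, Any],
    transform: Callable[[Record, str], Record],
    fetch: Callable[[str], str],
    save: Callable[[List[Record]], None],
) -> int:
    """Download every URL in the payload, transform the records, save them.

    `fetch` stands in for the blob download and `save` for the BigQuery
    write; both are assumptions made for this sketch.
    """
    tenant_id = payload["company_tenant"]["id"]
    rows: List[Record] = []
    for url in payload["file_urls"]:
        content = json.loads(fetch(url))
        # A file may hold a single object or a list of objects.
        records = content if isinstance(content, list) else [content]
        rows.extend(transform(record, tenant_id) for record in records)
    save(rows)
    return len(rows)
```

Only the `transform` argument would change between endpoints; the rest of the flow is shared.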

When I tried to use Gunicorn or Waitress to host the app instead of the Flask development server, the code got slower (I'm using k6 to run the performance tests).

When I remove the futures, the problem goes away: the code runs faster with Gunicorn and Waitress. Why is this happening? What's the difference between the Flask development server and those WSGI servers that makes them slower when using futures? I also tried using gevent directly (based on this answer), but the result was the same.
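One way to narrow this down outside of any web server is to time the executor pattern on its own. The sketch below compares creating a ProcessPoolExecutor for every call, as execute_action does, with reusing a single pool. The helper names (`run_with_new_pool`, `run_with_shared_pool`, `time_calls`) are hypothetical, and the built-in `abs` stands in for the real ETL function, so the numbers only reflect pool overhead, not the actual workload.

```python
import time
from concurrent import futures


def run_with_new_pool(action, *args, executor_cls=futures.ProcessPoolExecutor):
    """Mirror execute_action: build a pool, run one task, tear the pool down."""
    with executor_cls(max_workers=5) as executor:
        return executor.submit(action, *args).result()


def run_with_shared_pool(executor, action, *args):
    """Run one task on a pool that outlives the call."""
    return executor.submit(action, *args).result()


def time_calls(fn, calls=20):
    """Return the wall-clock seconds taken to call `fn` `calls` times."""
    start = time.perf_counter()
    for _ in range(calls):
        fn()
    return time.perf_counter() - start


if __name__ == "__main__":
    per_call = time_calls(lambda: run_with_new_pool(abs, -1))
    with futures.ProcessPoolExecutor(max_workers=5) as shared:
        reused = time_calls(lambda: run_with_shared_pool(shared, abs, -1))
    print(f"new pool per call: {per_call:.3f}s, shared pool: {reused:.3f}s")
```

If the per-call figure is much larger, pool startup is at least part of the cost. That alone would not explain why the servers behave differently, so it is only a first step.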

Salatiel
