Parallelize work within a Flask view with asyncio

Question

I am working on a Flask app in which the response to the client depends on replies that I get from a couple of external APIs. The requests to these APIs are logically independent from each other, so a speed gain can be realized by sending these requests in parallel (in the example below response time would be cut almost in half).

It seems to me that the simplest and most modern way to achieve this is to use asyncio and do all the work in a separate async function that is called from the Flask view function using asyncio.run(). I have included a short working example below.

Using Celery or any other type of queue with a separate worker process does not really make sense here, because the response has to wait for the API results before a reply can be sent anyway. As far as I can see this is a variant of this idea where a processing loop is accessed through asyncio. There are certainly applications for this, but I think that if we really just want to parallelize IO before answering a request, it is unnecessarily complicated.

However, I know that there can be some pitfalls in using various kinds of multithreading from within Flask. Therefore my questions are:

Would the implementation below be considered safe when used in a production environment? How does that depend on the kind of server that we run Flask on? In particular, what about the built-in development server or a typical multi-worker gunicorn setup such as the one suggested on https://flask.palletsprojects.com/en/1.1.x/deploying/wsgi-standalone/#gunicorn?
Are there any considerations to be made about Flask's app and request contexts in the async function, or can I simply use them as I would in any other function? I.e. can I simply import current_app to access my application config or use the g and session objects? When writing to them, possible race conditions would clearly have to be considered, but are there any other issues? In my basic tests (not in the example) everything seems to work all right.
Are there any other solutions that would improve on this?

Here is my example application. Since the asyncio interface has changed a bit over time, it is probably worth noting that I tested this on Python 3.7 and 3.8 and have done my best to avoid deprecated parts of asyncio.

import asyncio
import random
import time
from flask import Flask

app = Flask(__name__)

async def contact_api_a():
    print(f'{time.perf_counter()}: Start request 1')
    # This sleep simulates querying and having to wait for an external API
    await asyncio.sleep(2)

    # Here is our simulated API reply
    result = random.random()

    print(f'{time.perf_counter()}: Finish request 1')

    return result


async def contact_api_b():
    print(f'{time.perf_counter()}: Start request 2')
    await asyncio.sleep(1)

    result = random.random()

    print(f'{time.perf_counter()}: Finish request 2')

    return result


async def contact_apis():
    # Create the two tasks
    task_a = asyncio.create_task(contact_api_a())
    task_b = asyncio.create_task(contact_api_b())

    # Wait for both API requests to finish
    result_a, result_b = await asyncio.gather(task_a, task_b)

    print(f'{time.perf_counter()}: Finish both requests')

    return result_a, result_b


@app.route('/')
def hello_world():
    start_time = time.perf_counter()

    # All async processes are organized in a separate function
    result_a, result_b = asyncio.run(contact_apis())

    # We implement some final business logic before finishing the request
    final_result = result_a + result_b

    processing_time = time.perf_counter() - start_time

    return f'Result: {final_result:.2f}; Processing time: {processing_time:.2f}'

score 0 · Accepted Answer · answered Sep 07 '20 at 14:18

This will be safe to run in production, but asyncio will not work efficiently with the Gunicorn async workers, such as gevent or eventlet. This is because the call result_a, result_b = asyncio.run(contact_apis()) blocks the gevent/eventlet event loop until it completes, whereas the gevent/eventlet spawn equivalents do not. The Flask development server shouldn't be used in production. The Gunicorn threaded workers (or multiple Gunicorn processes) will be fine, as asyncio will only block the thread/process that is handling the request.
The globals will work fine as they are tied to either the thread (threaded workers) or green-thread (gevent/eventlet) and not to the asyncio task.
I would say Quart is an improvement (I'm the Quart author). Quart is the Flask API re-implemented using asyncio. With Quart the snippet above becomes:

import asyncio
import random
import time
from quart import Quart

app = Quart(__name__)


async def contact_api_a():
    print(f'{time.perf_counter()}: Start request 1')
    # This sleep simulates querying and having to wait for an external API
    await asyncio.sleep(2)

    # Here is our simulated API reply
    result = random.random()

    print(f'{time.perf_counter()}: Finish request 1')

    return result


async def contact_api_b():
    print(f'{time.perf_counter()}: Start request 2')
    await asyncio.sleep(1)

    result = random.random()

    print(f'{time.perf_counter()}: Finish request 2')

    return result


async def contact_apis():
    # Create the two tasks
    task_a = asyncio.create_task(contact_api_a())
    task_b = asyncio.create_task(contact_api_b())

    # Wait for both API requests to finish
    result_a, result_b = await asyncio.gather(task_a, task_b)

    print(f'{time.perf_counter()}: Finish both requests')

    return result_a, result_b


@app.route('/')
async def hello_world():
    start_time = time.perf_counter()

    # All async processes are organized in a separate function
    result_a, result_b = await contact_apis()

    # We implement some final business logic before finishing the request
    final_result = result_a + result_b

    processing_time = time.perf_counter() - start_time

    return f'Result: {final_result:.2f}; Processing time: {processing_time:.2f}'

I'd also suggest using an asyncio-based request library such as httpx, so that the real API calls do not block the event loop.
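
To illustrate the earlier point about threaded workers, here is a minimal sketch using only the standard library. It is not a Gunicorn setup: plain threads stand in for worker threads, the sleeps stand in for the API calls, and handle_request and serve_concurrently are hypothetical names chosen for this example. Each thread that calls asyncio.run() gets its own event loop and blocks only itself, so several simulated requests should finish in roughly the time of one.

```python
import asyncio
import threading
import time


async def contact_apis():
    # Stand-ins for the two external API calls; each sleep returns a label.
    return await asyncio.gather(
        asyncio.sleep(0.3, result="a"),
        asyncio.sleep(0.1, result="b"),
    )


def handle_request(results, index):
    # Hypothetical stand-in for a synchronous view running in a worker thread.
    # asyncio.run() creates a fresh event loop for this thread and blocks
    # only this thread until the coroutine has finished.
    results[index] = asyncio.run(contact_apis())


def serve_concurrently(n_requests=3):
    # Run n_requests simulated requests, one per thread, and time them.
    results = [None] * n_requests
    threads = [
        threading.Thread(target=handle_request, args=(results, i))
        for i in range(n_requests)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = serve_concurrently()
    print(f"Results: {results}; total time: {elapsed:.2f}s")
```

With a gevent or eventlet worker the picture is different, as described above: all requests in a worker share one gevent/eventlet loop, so a blocking asyncio.run() call holds up the others.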

Thanks for the detailed answer! Your comment on gevent/eventlet is well-taken. I am indeed currently using gevent in my setup, hence I will have to think a little bit about how to handle this potential performance hit (in my case simply ignoring it might be a reasonable solution, I'll have to do some testing). I am very interested in Flask-API based async approaches such as Quart and will likely try it on a future project, but alas, there is an existing code base here. — m_o_h, Sep 07 '20 at 15:57
