I have a web app that runs on Flask and a small amount of jQuery.

The app uses an API to gather data and returns a graph to users. For smallish requests, everything works. However, I'd like to enable a user to ask for a large collection of data. When the user asks for, say, 100 data points, the program needs to run for as long as 10 minutes.

Before the program is finished, a 502 error is thrown.

Additional details: I've set up logging and can see the program continuing to run and collect data after the 502. I'm also using Ajax to call the offending long-running Flask function, with an alert triggered on a 502, which is how I know the error code.

Question: Is there a solution to this problem?

TIA

wbg

2 Answers

2

There is. You probably want to use an async task queue like Celery.

Imagine this is the function that returns a response to the user:

from flask import Response

def respond():
    data = long_running_work()  # Blocks the return on long waits
    return Response(data)
    data = long_running_work()  # Never reached: code after return doesn't execute

There is no place to put the long-running work inside the request/response cycle if it takes longer than the timeout the client expects. On the other hand:

def respond():
    # Shoves a message into RabbitMQ
    # to be dealt with by a background worker
    send_task()  # takes no time, no blocking
    return Response('Please wait')

send_task is responsible for handing the work off to a background process that does it and makes the result available somewhere (on S3, on your server, wherever). In the client-side JS you can display a spinner and a 'waiting...' message while polling the expected location in the background, then show the result when it is ready. This way, nothing times out or gets corrupted.
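Celery's moving parts aside, the shape of that hand-off can be sketched with just the standard library — a thread standing in for the Celery worker and a dict standing in for your result store (S3, Redis, wherever). All names here are made up for illustration:

```python
import threading
import uuid

results = {}  # stand-in for S3/Redis/etc. where the finished work lands

def long_running_work():
    # Placeholder for the 10-minute data-collection job
    return "graph data"

def send_task():
    # In Celery this would be something like task.delay();
    # here a plain thread plays the worker's role.
    task_id = str(uuid.uuid4())

    def worker():
        results[task_id] = long_running_work()

    threading.Thread(target=worker).start()
    return task_id  # returns immediately, no blocking

def respond():
    # The request/response cycle only kicks the work off
    task_id = send_task()
    return ("Please wait", task_id)

def poll(task_id):
    # The client-side JS would hit an endpoint like this until it's done
    return results.get(task_id)
```

The client keeps calling the poll endpoint with its task id until it gets a non-empty result, then renders the graph.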

For a longer (and better) explanation, consider this.

bwarren2
  • I think this is the robust solution. This app runs on OpenShift; I think this will work. Will read about Celery. – wbg Dec 30 '15 at 23:01
  • Yeah, celery is the way to go. – cdvv7788 Dec 30 '15 at 23:03
  • However, after some initial reading, it's a bit complex...Is there no other way? – wbg Dec 30 '15 at 23:11
  • 1
    AFAIK it is the best way. Celery is powerful, and that shows up as complexity in the docs, but you should only need the rudiments: two functions decorated with the @task decorator, one to do the work and one to store it somewhere the client can get it. Call the first with the second chained on in the respond() function above, have a celery worker process and RabbitMQ running, and everything should be good. You're venturing into async world, which is hard, but good to understand. See also [this](https://blogs.vmware.com/vfabric/2013/04/how-instagram-feeds-work-celery-and-rabbitmq.html) – bwarren2 Dec 30 '15 at 23:17
0

You can avoid the 502 HTTP status code if you increase the timeout of the request, or alternatively use pagination so each request stays small.
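For the pagination route, the idea is to slice the dataset so each request finishes well inside the proxy timeout. A minimal sketch (the helper name and 1-indexed page scheme are made up for illustration):

```python
def paginate(items, page, per_page=10):
    # 1-indexed pages: each request returns one small, fast slice
    # instead of the whole 10-minute collection at once.
    start = (page - 1) * per_page
    return items[start:start + per_page]
```

The client then requests page 1, page 2, and so on, stitching the graph together incrementally as slices arrive.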

Totodile