I'm building a fairly simple web app in Flask that performs functions via a website's API. My users fill out a form with their account URL and API token; when they submit the form, a Python script exports PDFs from their account via the API. This function can take a long time, so I want to display a Bootstrap progress bar on the form page indicating how far along the script is. My question is: how do I update the progress bar while the function is running? Here is a simplified version of what I'm talking about.

views.py:

@app.route ('/export_pdf', methods = ['GET', 'POST'])
def export_pdf():
    form = ExportPDF()
    if form.validate_on_submit():
      try:
        export_pdfs.main_program(form.account_url.data,
          form.api_token.data)
        flash ('PDFs exported')
        return redirect(url_for('export_pdf'))
      except TransportException as e:
        s = e.content
        result = re.search('<error>(.*)</error>', s)
        flash('There was an authentication error: ' + result.group(1))
      except FailedRequest as e:
        flash('There was an error: ' + e.error)
    return render_template('export_pdf.html', title = 'Export PDFs', form = form)

export_pdf.html:

{% extends "base.html" %}

{% block content %}
{% include 'flash.html' %}
<div class="well well-sm">
  <h3>Export PDFs</h3>
  <form class="navbar-form navbar-left" action="" method ="post" name="receipt">
    {{form.hidden_tag()}}
    <br>
    <div class="control-group{% if form.errors.account_url %} error{% endif %}">
      <label class="control-label" for="account_url">Enter Account URL:</label>
      <div class="controls">
        {{ form.account_url(size = 50, class = "span4")}}
        {% for error in form.errors.account_url %}
          <span class="help-inline">[{{error}}]</span><br>
        {% endfor %}
      </div>
    </div>
    <br>
    <div class="control-group{% if form.errors.api_token %} error{% endif %}">
      <label class="control-label" for="api_token">Enter API Token:</label>
      <div class="controls">
        {{ form.api_token(size = 50, class = "span4")}}
        {% for error in form.errors.api_token %}
          <span class="help-inline">[{{error}}]</span><br>
        {% endfor %}
      </div>
    </div>
    <br>
    <button type="submit" class="btn btn-primary btn-lg">Submit</button>
  <br>
  <br>
    <div class="progress progress-striped active">
      <div class="progress-bar" role="progressbar" aria-valuenow="0" aria-valuemin="0" aria-valuemax="100" style="width: 0%">
        <span class="sr-only"></span>
      </div>
    </div>
  </form>
</div>
{% endblock %}

and export_pdfs.py:

def main_program(url, token):
    api_caller = api.TokenClient(url, token)
    path = os.path.expanduser('~/Desktop/'+url+'_pdfs/')
    pdfs = list_all(api_caller.pdf.list, 'pdf')
    total = 0
    count = 1
    for pdf in pdfs:
        total = total + 1
    for pdf in pdfs:
        header, body = api_caller.getPDF(pdf_id=int(pdf.pdf_id))
        with open('%s.pdf' % (pdf.number), 'wb') as f:
          f.write(body)
        count = count + 1
        if count % 50 == 0:
          time.sleep(1)

In that last function, total holds the number of PDFs I will export, and count keeps a running tally while it is processing. How can I send the current progress to my .html file so it can be used in the 'style=' attribute of the progress bar? Preferably in a way that lets me reuse the same tool for progress bars on other pages. Let me know if I haven't provided enough info.

FreshCrichard
  • I don't want to give a coded answer, but let me point you towards a solution. One conventional idea is to start a thread to do the PDF export. The thread reports its progress to a database table, and your browser front end uses AJAX polling to get the progress value from the database. As an alternative to AJAX polling, you may want to look at flask-socketio to push the progress value down to your browser. This alternative might require more engineering effort. – chfw Sep 02 '14 at 22:13
  • The idea of @chfw is how you should approach it. But rather than a thread, it should be an extra process that's waiting for jobs. And rather than a database, I'd use something like Redis and communicate via message queues. And finally, rather than using AJAX or WebSockets, I'd recommend [SSE](https://developer.mozilla.org/en-US/docs/Server-sent_events), which is easier to set up. – dAnjou Oct 11 '14 at 14:51
  • @FreshCrichard - How did you get it running ultimately? – Ira Jan 10 '20 at 23:15

3 Answers

As some others suggested in the comments, the simplest solution is to run your exporting function in another thread, and let your client pull progress information with another request. There are multiple approaches to handle this particular task. Depending on your needs, you might opt for a more or less sophisticated one.

Here's a very (very) minimal example of how to do it with threads:

import random
import threading
import time

from flask import Flask


class ExportingThread(threading.Thread):
    def __init__(self):
        self.progress = 0
        super().__init__()

    def run(self):
        # Your exporting stuff goes here ...
        for _ in range(10):
            time.sleep(1)
            self.progress += 10


exporting_threads = {}
app = Flask(__name__)
app.debug = True


@app.route('/')
def index():
    global exporting_threads

    thread_id = random.randint(0, 10000)
    exporting_threads[thread_id] = ExportingThread()
    exporting_threads[thread_id].start()

    return 'task id: #%s' % thread_id


@app.route('/progress/<int:thread_id>')
def progress(thread_id):
    global exporting_threads

    return str(exporting_threads[thread_id].progress)


if __name__ == '__main__':
    app.run()

In the index route (/) we spawn a thread for each exporting task and return an ID for that task, so that the client can later query it through the progress route (/progress/<thread_id>). The exporting thread updates its progress value whenever it sees fit.
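For the export loop in the question, the thread could derive a percentage from the number of items finished so far. The following is only a minimal sketch under assumptions: `items` stands in for the list of PDFs, and `export_one` is a hypothetical callable that would fetch and write a single PDF; neither name comes from the original code.

```python
import threading


class ExportingThread(threading.Thread):
    """Runs export_one(item) for every item and tracks the percentage done."""

    def __init__(self, items, export_one):
        super().__init__()
        self.items = list(items)      # e.g. the list of PDFs to export
        self.export_one = export_one  # hypothetical per-item export function
        self.progress = 0             # integer percentage, 0..100

    def run(self):
        total = len(self.items)
        if total == 0:
            # Nothing to do, so report the task as complete right away.
            self.progress = 100
            return
        for done, item in enumerate(self.items, start=1):
            self.export_one(item)
            # Integer percentage of items finished so far.
            self.progress = done * 100 // total


if __name__ == '__main__':
    exported = []
    thread = ExportingThread(['a', 'b', 'c', 'd'], exported.append)
    thread.start()
    thread.join()
    print(thread.progress, exported)
```

Passing the work in as arguments keeps the class reusable for other long-running tasks, which is what the question asks for.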

On the client side, you would get something like this (this example uses jQuery):

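function check_progress(task_id, progress_bar) {
    function worker() {
        $.get('progress/' + task_id, function(data) {
            // The server returns the progress as plain text, e.g. "40".
            var progress = parseInt(data, 10)
            progress_bar.set_progress(progress)
            if (progress < 100) {
                // Not finished yet: poll again in one second.
                setTimeout(worker, 1000)
            }
        })
    }
    // Start the polling loop.
    worker()
}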

As said, this example is very minimalistic and you should probably go for a slightly more sophisticated approach. Usually, we would store the progress of a particular thread in a database or a cache of some sort, so that we don't rely on a shared structure, hence avoiding most of the memory and concurrency issues my example has.

Redis (https://redis.io) is an in-memory data store that is generally well-suited for this kind of task. It integrates very nicely with Python (https://pypi.python.org/pypi/redis).
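As a halfway step before moving to a database or cache, access to the shared progress values can at least be put behind a small lock-protected object. This is a sketch only: `ProgressStore` and its method names are hypothetical, and because it lives in one process's memory it does not help when the app runs with several worker processes. A Redis-backed class could offer the same four methods.

```python
import threading
import uuid


class ProgressStore:
    """In-memory task progress, guarded by a lock (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._progress = {}

    def create(self):
        # uuid4 avoids the ID collisions a small random integer can produce.
        task_id = uuid.uuid4().hex
        with self._lock:
            self._progress[task_id] = 0
        return task_id

    def set(self, task_id, percent):
        # Clamp to 0..100 so the progress bar never gets an invalid width.
        with self._lock:
            self._progress[task_id] = max(0, min(100, int(percent)))

    def get(self, task_id):
        # Returns None for unknown IDs so a route can answer with a 404.
        with self._lock:
            return self._progress.get(task_id)

    def discard(self, task_id):
        # Drop finished tasks so the dictionary does not grow forever.
        with self._lock:
            self._progress.pop(task_id, None)
```

A progress route could call `get()` and return a 404 when it yields `None`, and call `discard()` once the client has seen 100.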

Alvae
  • Amazing. Thank you. – Nicholas Morley Aug 21 '17 at 15:55
  • This answer has worked well for me, except I'm struggling to figure out how you stop the setTimeout loop in the event of a server-side error. – Jon Behnken Oct 31 '19 at 15:34
  • Hi @Alvae, this method is great and very simple and I am currently implementing it. I have a question tho, will the memory related to the thread ever be freed? I mean is this safe? how can I free it after a period of time – Francesco Pegoraro Feb 07 '20 at 14:32
  • @FrancescoPegoraro If your thread completes then its memory will be reclaimed with the usual garbage collection mechanism. You do not need explicit memory deallocation in Python (or Javascript). Whether or not it is safe is a much harder question, which depends on your definition of "safe". – Alvae Feb 12 '20 at 16:05

I made a working and tested example using threading. Just copy paste and change at will.

Python

from flask import Flask, render_template
from threading import Thread
from time import sleep
import json

app = Flask(__name__)
status = None

def task():
  global status
  for i in range(1,11):
    status = i
    sleep(1)

@app.route('/')
def index():
  t1 = Thread(target=task)
  t1.start()
  return render_template('index.html')
  
@app.route('/status', methods=['GET'])
def getStatus():
  statusList = {'status':status}
  return json.dumps(statusList)

if __name__ == '__main__':
  app.run(debug=True)

HTML CSS JS

<!doctype html>
<html>

<head>
  <meta charset="UTF-8">

  <style>
  
  body {
    background-color: #D64F2A;
  }
  
  .progress {
    display: flex;
    position: absolute;
    height: 100%;
    width: 100%;
  }
  
  .status {
    color: white;
    margin: auto;
  }

  .status h2 {
    padding: 50px;
    font-size: 80px;
    font-weight: bold;
  }
  
  </style>

  <title>Status Update</title>

</head>

<body>
  <div class="progress">
    <div class="status">
      <h2 id="innerStatus">Loading...</h2>
    </div>
  </div>
</body>

<script>
var timeout;

async function getStatus() {

  let get;
  
  try {
    const res = await fetch("/status");
    get = await res.json();
  } catch (e) {
    console.error("Error: ", e);
    // Stop polling: `get` is undefined here, so the code below would throw.
    return;
  }
  
  document.getElementById("innerStatus").innerHTML = get.status * 10 + "&percnt;";
  
  if (get.status == 10){
    document.getElementById("innerStatus").innerHTML += " Done.";
    clearTimeout(timeout);
    return false;
  }
   
  timeout = setTimeout(getStatus, 1000);
}

getStatus();
</script>

</html>
D.Snap

I ran this simple but educational Flask SSE implementation on localhost. To use a third-party (user-uploaded) library such as gevent on Google App Engine (GAE):

  1. Create a directory named lib in your root path.
  2. Copy the gevent library directory into the lib directory.
  3. Add these lines to your main.py:

    import sys
    sys.path.insert(0,'lib')
    
  4. That's all. If you use the lib directory from a child folder, use a relative reference: sys.path.insert(0, '../../blablabla/lib')

From http://flask.pocoo.org/snippets/116/

# author: oskar.blom@gmail.com
#
# Make sure your gevent version is >= 1.0
import gevent
from gevent.pywsgi import WSGIServer  # gevent.wsgi is gone in newer gevent releases
from gevent.queue import Queue

from flask import Flask, Response

import time


# SSE "protocol" is described here: http://mzl.la/UPFyxY
class ServerSentEvent(object):

    def __init__(self, data):
        self.data = data
        self.event = None
        self.id = None
        self.desc_map = {
            self.data : "data",
            self.event : "event",
            self.id : "id"
        }

    def encode(self):
        if not self.data:
            return ""
        lines = ["%s: %s" % (v, k)
                 for k, v in self.desc_map.items() if k]  # items() works on Python 2 and 3

        return "%s\n\n" % "\n".join(lines)

app = Flask(__name__)
subscriptions = []

# Client code consumes like this.
@app.route("/")
def index():
    debug_template = """
     <html>
       <head>
       </head>
       <body>
         <h1>Server sent events</h1>
         <div id="event"></div>
         <script type="text/javascript">

         var eventOutputContainer = document.getElementById("event");
         var evtSrc = new EventSource("/subscribe");

         evtSrc.onmessage = function(e) {
             console.log(e.data);
             eventOutputContainer.innerHTML = e.data;
         };

         </script>
       </body>
     </html>
    """
    return(debug_template)

@app.route("/debug")
def debug():
    return "Currently %d subscriptions" % len(subscriptions)

@app.route("/publish")
def publish():
    #Dummy data - pick up from request for real data
    def notify():
        msg = str(time.time())
        for sub in subscriptions[:]:
            sub.put(msg)

    gevent.spawn(notify)

    return "OK"

@app.route("/subscribe")
def subscribe():
    def gen():
        q = Queue()
        subscriptions.append(q)
        try:
            while True:
                result = q.get()
                ev = ServerSentEvent(str(result))
                yield ev.encode()
        except GeneratorExit: # Or maybe use flask signals
            subscriptions.remove(q)

    return Response(gen(), mimetype="text/event-stream")

if __name__ == "__main__":
    app.debug = True
    server = WSGIServer(("", 5000), app)
    server.serve_forever()
    # Then visit http://localhost:5000 to subscribe 
    # and send messages by visiting http://localhost:5000/publish
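The ServerSentEvent class above keys its map by field value, so the event and id fields collide whenever both are None. A plain function may be easier to follow. This is a sketch of the SSE text format only, not part of the original snippet; `format_sse` and its parameter names are hypothetical.

```python
def format_sse(data, event=None, event_id=None):
    """Encode one Server-Sent Events message as text (hypothetical helper)."""
    lines = []
    if event is not None:
        lines.append('event: %s' % event)
    if event_id is not None:
        lines.append('id: %s' % event_id)
    # Multi-line payloads need one "data:" field per line.
    for chunk in str(data).split('\n'):
        lines.append('data: %s' % chunk)
    # A blank line terminates the message.
    return '\n'.join(lines) + '\n\n'
```

For a progress bar, the generator in /subscribe could yield `format_sse(percent)` each time the export advances, and the page's `onmessage` handler would set the bar width from `e.data`.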
illright
guneysus