I have a Flask app that displays some data as a PNG. Below is a simple dummy example:

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np
from io import BytesIO

import logging
logger = logging.getLogger(__name__)

def specimage():
    plt.clf()
    plt.plot(np.arange(3000), np.random.random(3000), drawstyle="steps-post")
    plt.loglog()
    out = BytesIO()
    logger.debug("Rendering...")
    plt.savefig(out, format='png')
    logger.debug("Done rendering")
    out.seek(0)
    return out.getvalue()

If I run this interactively, the function takes maybe 1 second to run. But if I call it from a Flask app:

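from flask import Response
import render

# `app` is the existing Flask application object
@app.route('/getfig')
def getfig():
    # image/png is the registered MIME type for PNG data
    return Response(render.specimage(), content_type="image/png")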

Now the call to savefig takes at least 15 seconds, with the CPU pegged at 100% the whole time!

If I take this view and insert it into a very basic Flask app, it runs at a reasonable speed. But my app is quite complicated. Where can I look to find weird interactions that could be slowing matplotlib down so much?

EDIT: A bit more information that might be relevant. While the image is being rendered, the application has already launched a separate thread doing some lengthy calculations. It looks like the image generation hangs until this other thread finishes.

So that raises the question, why is matplotlib stuck? I can access any other view in my flask app and it returns instantly, so in general the threading is working correctly. Am I running into some weird GIL issue here? The other thread is doing some zlib compression, and I think PNG uses zlib as well, so, maybe that's the issue?

EDIT2: Removing the zlib calls from the other thread didn't help, so it's not zlib's fault. The other thread is not calling any matplotlib functions, but it is in the same module as the plotting thread, meaning matplotlib.pyplot is imported there...

thegreatemu
  • are you running the flask app locally? – Paul H Sep 10 '20 at 20:02
  • @PaulH yes. And the debug statements surrounding `savefig` confirm that that's the problematic part, not any kind of network latency – thegreatemu Sep 10 '20 at 20:16
  • can you do a `plt.draw()` outside of your logging statement to isolate if it's the rendering or the saving? – Paul H Sep 10 '20 at 20:38
  • I was also able to run this as a very basic flask app, and it is instant. I think it's impossible to answer this question without a greater view of the code. Do you happen to have a repo available? – v25 Sep 10 '20 at 20:50
  • `plt.draw()` is also slow – thegreatemu Sep 10 '20 at 20:51
  • the application is at https://github.com/bloer/bgexplorer ...it's a pretty big mess – thegreatemu Sep 10 '20 at 20:52
  • Sorry, the application is there, but not the image generation part; that's what I'm trying (and failing) to add now – thegreatemu Sep 10 '20 at 20:58
  • The CPU being pegged is useful data. Is there any chance that the other thread is spending lots of time in external code where the GIL might not be involved? – Dave W. Smith Sep 10 '20 at 21:03
  • It's getting data from pymongo and doing calculations with numpy. I would guess both of those would release the GIL – thegreatemu Sep 10 '20 at 21:07
  • I remember having some trouble with matplotlib and threads. [This answer](https://stackoverflow.com/a/34769067/2052575) suggests it's not thread-safe. The behaviour I was seeing was that a route which generates a chart as a PNG, when called 3 times from an index page (to place 3 charts on the page), led to points from each chart crossing onto other charts and other crazy weird behaviour. My solution was to launch `flask run --without-threads` and later use a WSGI worker type which wasn't threaded. This probably doesn't help in your case, as your app seems to be designed around threads. – v25 Sep 10 '20 at 21:14
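
The isolation step suggested in the comments (drawing separately from saving) could be sketched roughly as follows. `timed_render` is a hypothetical helper name, not something from the original app, and the figure is the same dummy plot as in the question. Note that `savefig` renders the figure again, so its timing covers a second draw plus the PNG encoding.

```python
import time
from io import BytesIO

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, as in the question
import matplotlib.pyplot as plt
import numpy as np


def timed_render():
    """Hypothetical helper: time the draw and the save separately."""
    fig, ax = plt.subplots()
    ax.plot(np.arange(3000), np.random.random(3000), drawstyle="steps-post")
    ax.loglog()

    t0 = time.perf_counter()
    fig.canvas.draw()               # rasterise the figure only
    t1 = time.perf_counter()

    out = BytesIO()
    fig.savefig(out, format="png")  # re-renders, then encodes to PNG
    t2 = time.perf_counter()

    plt.close(fig)                  # release the figure held by pyplot
    return {"draw_s": t1 - t0, "savefig_s": t2 - t1, "png": out.getvalue()}
```

Calling this once from an interactive session and once from inside the Flask view, and comparing the two timings, should show whether the slowdown is already present in the draw step, which is what the comment above reports.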

1 Answer

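Two things to try:

  1. On the Matplotlib side: make sure the figure is closed once `savefig()` has run, by calling `plt.close()`. pyplot keeps each figure it creates alive until it is closed.

  2. On the Flask side: use Flask-Caching so the view does not re-render the image on every request: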

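    from flask import Flask
    from flask_caching import Cache

    # SimpleCache is an in-memory cache local to the process
    cache = Cache(config={'CACHE_TYPE': 'SimpleCache'})
    app = Flask(__name__)
    cache.init_app(app)

    @app.route("/")
    @cache.cached(timeout=55)  # reuse the cached response for 55 seconds
    def index():
        ................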
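
For the first point, one option is to skip pyplot entirely and build the figure with `matplotlib.figure.Figure`, which is the pattern the Matplotlib documentation suggests for web servers. A figure created this way is not registered with pyplot's global state, so there is nothing to `plt.close()` afterwards. The sketch below is only an illustration under that assumption: `specimage_oo` is a hypothetical name, and this is not confirmed to fix the 15-second slowdown described in the question.

```python
from io import BytesIO

import numpy as np
from matplotlib.figure import Figure


def specimage_oo():
    """Hypothetical pyplot-free variant of the question's specimage()."""
    fig = Figure()       # not tracked by pyplot, so no global state is shared
    ax = fig.subplots()
    ax.plot(np.arange(3000), np.random.random(3000), drawstyle="steps-post")
    ax.set_xscale("log")
    ax.set_yscale("log")

    out = BytesIO()
    fig.savefig(out, format="png")
    return out.getvalue()  # raw PNG bytes
```

The returned bytes can be passed to `Response` in the same way as in the question's view. Because each request gets its own `Figure`, concurrent requests do not write into a shared current figure, which is the kind of cross-talk described in the comments.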
    
Daniel Moraite