So I'm real green with file I/O, memory limits, and such, and I'm having a rough time getting my web application to successfully serve large file downloads to a web browser with Flask's make_response. The following code works on smaller files (under ~1 GB), but gives me a MemoryError exception when I get into larger files:

raw_bytes = ""
with open(file_path, 'rb') as r:
    for line in r:
        raw_bytes = raw_bytes + line
response = make_response(raw_bytes)
response.headers['Content-Type'] = "application/octet-stream"
response.headers['Content-Disposition'] = "inline; filename=" + file_name
return response

I'm assuming that sticking over 2 GB worth of binary data into a string is probably a big no-no, but I don't know an alternative to accomplishing these file download black magicks. If someone could get me on the right track with a chunky[?] or buffered approach for file downloads, or just point me toward some intermediate-level resources to facilitate a deeper understanding of this stuff, I would greatly appreciate it. Thanks!

SheffDoinWork
  • Why read the file at all? You can just use the [`send_file()`](https://flask.readthedocs.org/en/latest/api/#flask.send_file) function to have Flask serve the file *for you*. – Martijn Pieters Jun 20 '14 at 10:29
  • @MartijnPieters Because I was unaware of the existence of such a function :D I just reworked the implementation to use it and now I've another issue; it seems to be remembering the first file of a given type that is opened and using that for all subsequent files of that type. ie, if I have a bunch of png files, a pdf file, and a txt file, the application will successfully download the first correct png, and then serve the same image for every other, _distinct_ png file. On the server side, I've verified that the send_file function gets the correct path to the pngs, but it still misbehaves. – SheffDoinWork Jun 20 '14 at 16:29
  • Then something else is still wrong; that's not behaviour that `send_file()` on its own can produce. – Martijn Pieters Jun 20 '14 at 16:30

2 Answers

See the docs on Streaming Content. Basically, you write a function that yields chunks of data, and pass that generator to the response, rather than the whole thing at once. Flask and your web server do the rest.

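from flask import Flask, Response, stream_with_context

app = Flask(__name__)

@app.route('/stream_data')
def stream_data():
    def generate():
        # create and yield your data in small parts here
        for i in range(10000):  # range replaces the Python 2-only xrange
            yield str(i)

    # stream_with_context keeps the request context alive while generating
    return Response(stream_with_context(generate()))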

If the file is static, you can instead take advantage of send_from_directory(). The docs advise you to use nginx or another server that supports X-SendFile, so that reading and sending the data is efficient.
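
For the static case, here is a minimal sketch of what that could look like. The route, the `DOWNLOAD_DIR` location, and the `download` function name are all made up for illustration; only `send_from_directory()` itself comes from Flask.

```python
import os
import tempfile

from flask import Flask, send_from_directory

app = Flask(__name__)

# Hypothetical download directory (an assumption for this sketch);
# point it at wherever your files actually live.
DOWNLOAD_DIR = os.path.join(tempfile.gettempdir(), "downloads_example")
os.makedirs(DOWNLOAD_DIR, exist_ok=True)


@app.route('/download/<path:file_name>')
def download(file_name):
    # send_from_directory rejects paths that escape DOWNLOAD_DIR and
    # passes the open file on to the server rather than building one
    # big in-memory string. as_attachment=True adds a
    # Content-Disposition: attachment header.
    return send_from_directory(DOWNLOAD_DIR, file_name, as_attachment=True)
```

A request for a file that is not in the directory gets a 404 instead of an exception in your view.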

davidism

The problem with your attempt is that you first read the complete content into "raw_bytes", so with large files you can easily exhaust all the memory you have.

There are multiple options to resolve that:

Streaming the content

As explained in davidism's answer, you can pass a generator into Response. This serves the large file piece by piece and needs far less memory.

The stream does not have to come from a generator; it can also come from a file, as shown in this answer.
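
A minimal sketch of streaming from a file, assuming you already have `file_path` and `file_name` as in the question. The helper names `read_in_chunks` and `stream_file` and the 64 KiB chunk size are my own choices here, not anything Flask prescribes.

```python
import os

from flask import Response

CHUNK_SIZE = 64 * 1024  # assumed chunk size; tune it for your setup


def read_in_chunks(file_path, chunk_size=CHUNK_SIZE):
    """Yield the file's bytes one chunk at a time."""
    with open(file_path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


def stream_file(file_path, file_name):
    """Build a download response that never holds the whole file in memory."""
    response = Response(read_in_chunks(file_path),
                        mimetype='application/octet-stream')
    response.headers['Content-Disposition'] = (
        'attachment; filename="%s"' % file_name)
    # Sending the size lets the browser show download progress.
    response.headers['Content-Length'] = str(os.path.getsize(file_path))
    return response
```

In a view function you would then `return stream_file(file_path, file_name)` in place of the `make_response(raw_bytes)` call. Only one chunk is in memory at a time, so memory use stays flat no matter how large the file is.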

Serving static files over flask

If your file is static, look up how to configure Flask to serve static files. These should be served in a streaming manner automatically.

Serving static files over apache or nginx (or other web server)

Assuming the file is static, in production you should serve it from a reverse proxy (Apache, nginx, or similar) in front of your Flask app. This takes the load off your app and is also much faster.

Jan Vlcinsky