In my Flask application, a route accepts file uploads via HTTP PUT. Because the files are then processed by an external tool, I need to save them to disk. Because they can be large (a couple hundred MB), I avoid request.get_data() and instead read from request.stream, writing the incoming data to disk in constant memory.

Right now I do this:

import os
from tempfile import NamedTemporaryFile
from flask import request, send_file

# ...

@app.route('/<path>', methods=['PUT'])
def handle_upload(path):
    tmpFile = None
    try:
        with NamedTemporaryFile(delete=False) as tmpFile:
            # Copy the request body to disk in 16 KB chunks
            while True:
                chunk = request.stream.read(16384)
                if not chunk:
                    break
                tmpFile.write(chunk)
        processFile(tmpFile.name) # This calls out to an external tool
        return send_file(tmpFile.name)
    finally:
        # tmpFile stays None if creating the temporary file failed
        if tmpFile is not None:
            os.remove(tmpFile.name)

However, seeing that Flask features convenience methods like send_file, I wonder: does either Flask or Werkzeug maybe offer a ready-made function to stream the data of a request into a file, something shorter than a hand-written loop? I suspect I may be overlooking something in the API docs.

Frerich Raabe
  • You can't `PUT` file, but can `POST` (`PUT` method has a **3MB** limit). – dsgdfg Jan 31 '18 at 07:34
  • @dsgdfg Is that specified somewhere? In my experiments, there did not seem to be an upper limit on the size of the file being uploaded (I used e.g. `curl -T foo.dat http://myserver:12345`). – Frerich Raabe Jan 31 '18 at 07:38
  • You say that request body can be unlimited size. I think it's the limit of all the resources. – dsgdfg Jan 31 '18 at 07:42

2 Answers

1

I had the same problem but found a better alternative: handle it the same way as a POST upload and wrap the stream in a werkzeug.datastructures.FileStorage:

from tempfile import NamedTemporaryFile
from werkzeug.datastructures import FileStorage

incoming = FileStorage(request.stream)
with NamedTemporaryFile(delete=False) as tmpFile:
    incoming.save(tmpFile)

Note the description of save:

Save the file to a destination path or file object. If the destination is a file object you have to close it yourself after the call. The buffer size is the number of bytes held in memory during the copy process. It defaults to 16KB.
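
For what it is worth, the copy that save performs looks much like shutil.copyfileobj from the standard library called with a fixed buffer size. The following is a minimal sketch under that assumption, so check it against your Werkzeug version. The helper save_stream is a hypothetical name, and io.BytesIO stands in for request.stream so the snippet runs without a server:

```python
import io
import os
import shutil
from tempfile import NamedTemporaryFile


def save_stream(stream, buffer_size=16384):
    """Copy a readable binary stream to a temporary file and return its path.

    Hypothetical helper: at most buffer_size bytes are held in memory
    at a time while copying.
    """
    with NamedTemporaryFile(delete=False) as tmp_file:
        shutil.copyfileobj(stream, tmp_file, buffer_size)
    return tmp_file.name


# io.BytesIO is only a stand-in for request.stream in this sketch
payload = b"x" * 50000
saved_path = save_stream(io.BytesIO(payload))
saved_size = os.path.getsize(saved_path)
os.remove(saved_path)  # the caller is responsible for cleanup
```

In a real handler, request.stream would be passed in place of the BytesIO object, and the file would be removed only after processing has finished.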

Flamefire
  • "The FileStorage class is a thin wrapper over incoming files. It is used by the request object to represent uploaded files." indeed sounds very much like what I was looking for. Thanks a lot for taking the time to post this answer! – Frerich Raabe Jun 05 '19 at 13:53
0

After researching some more and talking to various people, it seems to me that reading request.stream in chunks is the best way to save the data of a PUT request to a file, i.e.

# f is a file object opened for writing in binary mode
for chunk in iter(lambda: request.stream.read(16384), b''):
    f.write(chunk)
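
To see why that loop terminates: the two-argument form of iter calls the lambda repeatedly and stops as soon as it returns the sentinel, an empty bytes object, which is what read() yields at the end of the stream. Here is a small sketch with hypothetical names (copy_in_chunks, plus io.BytesIO objects standing in for request.stream and the target file):

```python
import io


def copy_in_chunks(src, dst, chunk_size=16384):
    """Copy src to dst chunk by chunk and return the number of chunks written."""
    chunks = 0
    # iter(callable, sentinel) keeps calling src.read(chunk_size)
    # until the call returns the sentinel b"" (end of stream)
    for chunk in iter(lambda: src.read(chunk_size), b""):
        dst.write(chunk)
        chunks += 1
    return chunks


src = io.BytesIO(b"a" * 40000)  # stand-in for request.stream
dst = io.BytesIO()              # stand-in for the open target file
chunk_count = copy_in_chunks(src, dst)
```

With 40000 bytes and a 16384-byte chunk size, the data arrives as two full chunks and one partial chunk.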
Frerich Raabe