
We have a React application communicating with a Django backend. Whenever the React application wants to upload a file to the backend, we send a form request with one field being the handle of the file being uploaded. The field is received on the Django side as an InMemoryUploadedFile, an object that exposes its content in chunks, which can be processed for example like this:

import logging

logger = logging.getLogger(__name__)


def save_uploaded_file(uploaded_file, handle):
    """
    Saves the uploaded file using the given file handle.
    We walk the chunks to avoid reading the whole file into memory.
    """
    for chunk in uploaded_file.chunks():
        handle.write(chunk)
    handle.flush()
    logger.debug(f'Saved file {uploaded_file.name} with length {uploaded_file.size}')
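To make the chunk-walking concrete, here is a minimal sketch that runs without Django. `FakeUpload` is a hypothetical stand-in that only mimics the `name`, `size` and `chunks()` members used above; it is not the real InMemoryUploadedFile, and the chunk size of 4 bytes is chosen only to keep the example small.

```python
import io
import logging

logger = logging.getLogger(__name__)


class FakeUpload:
    """Hypothetical stand-in exposing name, size and chunks() like an uploaded file."""

    def __init__(self, name, payload, chunk_size=4):
        self.name = name
        self.size = len(payload)
        self._payload = payload
        self._chunk_size = chunk_size

    def chunks(self):
        # Yield the payload piece by piece instead of returning it in one go.
        for start in range(0, self.size, self._chunk_size):
            yield self._payload[start:start + self._chunk_size]


def save_uploaded_file(uploaded_file, handle):
    # Same logic as above: write chunk by chunk, then flush.
    for chunk in uploaded_file.chunks():
        handle.write(chunk)
    handle.flush()
    logger.debug('Saved file %s with length %s', uploaded_file.name, uploaded_file.size)


upload = FakeUpload('myfile', b'0123456789')
target = io.BytesIO()  # any writable binary handle works here
save_uploaded_file(upload, target)
saved = target.getvalue()
```

The handle ends up with the same bytes as the payload, written a few bytes at a time.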

Now, I am creating some testing framework using requests to drive our API. I am trying to emulate this mechanism, but strangely enough, requests insists on reading from the open handle before sending the request. I am doing:

requests.post(url, data, headers=headers, **kwargs)

with:

data = {'content': open('myfile', 'rb'), ...}

Note that I am not reading from the file, I am just opening it. But requests insists on reading from it, and sends the data embedded, which has several problems:

  • it can be huge
  • by being binary data, it corrupts the request
  • it is not what my application expects

I do not want this: I want requests simply to "stream" that file, not to read it. There is a files parameter, but that will create a multipart with the file embedded in the request, which is again not what I want. I want all fields in the data to be passed in the request, and the content field to be streamed. I know this is possible because:

  • the browser does it
  • Postman does it
  • the django test client does it

How can I force requests to stream a particular file in the data?

blueFast
  • maybe this could help: https://stackoverflow.com/a/39217788/7505395 – Patrick Artner Mar 28 '20 at 09:24
  • and this might be a dupe: [https://stackoverflow.com/questions/2502596/python-http-post-a-large-file-with-streaming](https://stackoverflow.com/questions/2502596/python-http-post-a-large-file-with-streaming) - if so please dupeclose it yourself :) If I close it (and I am not sure) it gets dupe-hammered – Patrick Artner Mar 28 '20 at 09:24
  • @PatrickArtner thanks, but not the same. Those are talking either about non-requests solutions, or about using the full data parameter as the file to be streamed. My problem is different: the data parameter has several fields, and *one* of them is a file to be streamed. If I try this with `requests`, the file gets read. How can I avoid this? – blueFast Mar 28 '20 at 09:46

1 Answer


Probably, this is no longer relevant, but I will share some information that I found in the documentation.

By default, if an uploaded file is smaller than 2.5 megabytes, Django will hold the entire contents of the upload in memory. This means that saving the file involves only a read from memory and a write to disk and thus is very fast. However, if an uploaded file is too large, Django will write the uploaded file to a temporary file stored in your system’s temporary directory.

This way, there is no need to create a streaming file upload. Rather, the solution might be to handle (read) the uploaded file on the Django side using a buffer, chunk by chunk.
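On the client side, a possible sketch, assuming the backend accepts the multipart/form-data encoding that browsers and the Django test client normally use for file uploads: pass the ordinary fields in `data` and the file in `files`. The URL, the `title` field and the in-memory file below are made up for illustration, and the request is only prepared, not sent, so the encoding can be inspected.

```python
import io

import requests

url = 'http://localhost:8000/upload/'  # hypothetical endpoint
# Stands in for open('myfile', 'rb'); any binary file object works.
file_obj = io.BytesIO(b'\x00\x01binary payload')

request = requests.Request(
    'POST',
    url,
    data={'title': 'example'},  # hypothetical extra form field
    files={'content': ('myfile', file_obj, 'application/octet-stream')},
)
prepared = request.prepare()  # builds headers and body without sending anything

content_type = prepared.headers['Content-Type']
body = prepared.body
```

With this encoding the binary content sits in its own part, delimited by the boundary, so it does not corrupt the other fields. Note that requests still reads the whole file into memory when it builds the body this way; if that matters for very large files, the `MultipartEncoder` from the separate requests-toolbelt package is one option worth checking.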

Timur Kady