4

I'm using Python 3.8 on Windows 10 using the requests module. As the title suggests, I am posting very large files to an HTTP server and I want to provide the status of the upload.

I've read 10-20 Stack Overflow threads on this topic, read articles sprinkled around on the internet, and dived into source code of projects on GitHub I can't even remember at this point. I tried to implement everything that I read to no avail. A lot of the information regarding this topic is years old now and the requests module has been vastly improved since - so some info could be outdated.

The issue that I am experiencing is sending the file chunks using requests.Response.post('Some-URL', data=file_chunk, header=header). If there are 5 file chunks uploaded, then on the server there are 5 separate files instead of 1 combined file.

In order to provide the status of the file upload, I created a generator function similar to the example shown below.

def read_in_chunks(file_object, chunk_size=1024):
    """Generator to read a file piece by piece.
    Default chunk size: 1k."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data

I then iterated over the generator object that was created like this:

with open('Some-File.zip', 'rb') as file_obj:
    for file_chunk in read_in_chunks(file_obj):
        requests.Response.post('Some-URL', data=file_chunk.encode('utf-8'), header=header)

This does NOT work. On the server where the file chunks are uploaded to, each chunk is stored on the server as a separate file. If the file is broken into 5 chunks, then there are now 5 files. In the requests documentation, it says you can pass a generator function to the data= parameter, though I cannot get that to work. The docs also say to iterate over the data using Response.iter_content(), but I do not know what this exactly means or how to implement it. Documentation seems a little sparse on this topic.

I've also tried using requests-toolbelt following this code here. The code pretty much is identical to examples in the documentation. I experienced the same issue described above. I also create an SHA-256 hash of the file before I upload it and the hash would change every single time I executed the script before the upload started... no idea so I stopped using this method.

There is a chance that on the server side, chunk file uploading is not supported which is my thought if I am implementing this correctly.

probat
  • 1,422
  • 3
  • 17
  • 33
  • What version of Python and requests are you using? Did you mean to say "requests.Request" rather than "requests.Response"? Invoking POST on a response doesn't make much sense. Why are you encoding the data into UTF-8? What is the content of your headers? All that aside, I think the simplest way to get what you want is to wrap your file object. See this [Stack Overflow question](https://stackoverflow.com/questions/13909900/progress-of-python-requests-post) – BS Labs Apr 21 '20 at 17:39
  • I came across that thread before posting this and i was hoping for a more elegant answer from the community. After you suggested this, I revisited it and did get that to work in my project. It would be nice if there were a more simple approach, but I do not think there is. – probat Apr 24 '20 at 23:38
  • 4
    @probat, could you post your final solution please ? – Teddy Kossoko Jan 11 '21 at 14:53

0 Answers0