
I have a large .mp4 file that I am trying to download from an API and write to disk. The request runs for a long time and then fails with `ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))`. How can I make the download complete and write the file successfully?

Here is my code:

import requests
from requests.auth import HTTPBasicAuth

if '.mp4' in assetFileName:
    # Stream the response so the whole file is never held in memory
    myfile = requests.get(stepAssetURL, stream=True,
                          auth=HTTPBasicAuth(UserID, UserPassword))
    destPath = path + assetFileName
    with open(destPath, 'wb') as d:
        for chunk in myfile.iter_content(chunk_size=1024):
            print(chunk)
            if chunk:  # skip keep-alive chunks
                d.write(chunk)
  • How much of the file are you receiving before the error? Is the error in the final chunk of the file? What happens if you do not stream the file, but download it in one go? – mhawke Feb 02 '21 at 22:54
  • @mhawke. I'm not sure how much of the file I'm receiving. How do I tell that? If I don't stream the file, I get a MemoryError. What is the solution for this? – LibertyMan Feb 02 '21 at 22:55
  • OK, if it's exhausting memory, you'll have to stream it. I was just curious to see whether it was possible to correctly receive the file. How to tell how much of the file was received? Download the file using another client, e.g. curl, and then compare the size of the file that you've created with the reference file that you downloaded with curl. This might lead nowhere; I am merely suggesting lines of investigation that might help you to discover a solution. – mhawke Feb 02 '21 at 23:00
  • Also, you could try using `chunk_size=None` to write the data in whatever size the chunks are received. I don't know that it will help but it's another variable to play with. – mhawke Feb 02 '21 at 23:04
  • @mhawke it looks like it is writing the file but taking a long time to do it. I updated the code to `chunk_size=100000`. It hasn't timed out yet. What chunk_size should I give it? – LibertyMan Feb 02 '21 at 23:05
  • I'd just use `None` for the chunk size and let the library deal with it. It really shouldn't matter what size is used. Also, don't print the chunks - that will take extra time. It could be that the server is closing the connection because it times out between chunks. (A sketch of this approach appears after this thread.) – mhawke Feb 02 '21 at 23:12
  • Okay thanks. Let me try that real quick. – LibertyMan Feb 02 '21 at 23:17
  • @mhawke, I changed the chunk_size to None and it is writing the file but it is taking a really long time. How can this be sped up? – LibertyMan Feb 02 '21 at 23:55
  • The bottleneck will probably be the speed of your network connection and the speed of the server. There's not much that you can do about that in your code. You could try downloading chunks of the file in parallel using multi-threading or multiprocessing, but you would still be limited by your bandwidth if the server is able to send the file quickly enough. Another option would be to spawn a download manager (e.g. [aria2](https://aria2.github.io/)) in a separate process that can perform multi-threaded downloads; see the aria2 sketch after this thread. Also look on PyPI for a Python package. – mhawke Feb 03 '21 at 00:11
  • @mhawke even after giving it time it got up to 600,000 KB and then gave me the ChunkedEncodingError. Is there anything else I can do about this? – LibertyMan Feb 03 '21 at 00:14
  • You might be able to use `urllib` to achieve streamed downloading; see https://stackoverflow.com/a/1517728/21945, and the following answer https://stackoverflow.com/a/5397438/21945 is also interesting. A `urllib` sketch appears after this thread. – mhawke Feb 03 '21 at 03:20
  • Or you could use `urllib3`, which is used by `requests` under the hood. See the documentation section on [Streaming and I/O](https://urllib3.readthedocs.io/en/latest/advanced-usage.html?highlight=chunk#streaming-and-i-o); a matching sketch appears after this thread. – mhawke Feb 03 '21 at 03:28
  • Hey @LibertyMan, did any of these suggestions work? – mhawke Feb 03 '21 at 23:24
  • Hi @mhawke. The whole issue was the speed of my network connection. Once I connected to a better network, I didn't even need to stream the file any more; simply opening and writing the file worked. The chunked option also worked successfully on the better network. Thanks for all of your help! – LibertyMan Feb 05 '21 at 13:34
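
Below are sketches of the approaches suggested in the comments above. They reuse the variable names from the question (`stepAssetURL`, `UserID`, `UserPassword`, `path`, `assetFileName`, `destPath`); the placeholder values are assumptions for illustration only.

First, a minimal sketch of the streamed download with `chunk_size=None` and no per-chunk printing, as suggested in the thread. The `timeout` value is an added assumption to guard against a stalled connection:

```python
import requests
from requests.auth import HTTPBasicAuth

# Placeholder values -- substitute your own
stepAssetURL = "https://example.com/api/asset.mp4"
UserID, UserPassword = "user", "password"
destPath = "/tmp/asset.mp4"

# Stream the response and write chunks as they arrive, letting the
# library choose the chunk size (chunk_size=None).
with requests.get(stepAssetURL, stream=True,
                  auth=HTTPBasicAuth(UserID, UserPassword),
                  timeout=60) as myfile:
    myfile.raise_for_status()
    with open(destPath, 'wb') as d:
        for chunk in myfile.iter_content(chunk_size=None):
            d.write(chunk)  # no print(); printing every chunk slows the loop
```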
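
Next, a hedged sketch of the download-manager suggestion: spawn [aria2](https://aria2.github.io/) (installed separately, as the `aria2c` binary) in a child process so it can fetch segments of the file in parallel. The connection and split counts are assumptions:

```python
import subprocess

# Let aria2 download the file over multiple parallel connections.
subprocess.run([
    "aria2c",
    "--max-connection-per-server=4",   # up to 4 connections to the host
    "--split=4",                       # download in 4 parallel segments
    f"--http-user={UserID}",
    f"--http-passwd={UserPassword}",
    f"--dir={path}",                   # destination directory
    f"--out={assetFileName}",          # output file name
    stepAssetURL,
], check=True)  # raises CalledProcessError if aria2c exits non-zero
```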
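
A sketch of the `urllib` route from the linked answers, with basic auth wired up through an auth handler; `shutil.copyfileobj` streams the body to disk in fixed-size chunks rather than loading it all at once:

```python
import shutil
import urllib.request

# Register the credentials for the asset URL.
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, stepAssetURL, UserID, UserPassword)
opener = urllib.request.build_opener(
    urllib.request.HTTPBasicAuthHandler(password_mgr))

# Stream the response straight to the file in fixed-size chunks.
with opener.open(stepAssetURL) as response, open(destPath, 'wb') as d:
    shutil.copyfileobj(response, d)
```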
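
Finally, a sketch of streaming with `urllib3` directly, per its [Streaming and I/O](https://urllib3.readthedocs.io/en/latest/advanced-usage.html?highlight=chunk#streaming-and-i-o) docs; `preload_content=False` keeps the body out of memory until it is read:

```python
import urllib3

http = urllib3.PoolManager()
resp = http.request(
    "GET", stepAssetURL,
    headers=urllib3.make_headers(basic_auth=f"{UserID}:{UserPassword}"),
    preload_content=False,  # don't buffer the whole body in memory
)
with open(destPath, 'wb') as d:
    for chunk in resp.stream(65536):  # read roughly 64 KB at a time
        d.write(chunk)
resp.release_conn()  # return the connection to the pool
```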

0 Answers