4

I am currently doing a machine learning course on Udacity. The course code is written in Python 2.7, but since I am using Python 3.5, I am getting an error. This is the code:

import urllib
url = "https://www.cs.cmu.edu/~./enron/enron_mail_20150507.tgz"
urllib.urlretrieve(url, filename="../enron_mail_20150507.tgz")
print ("download complete!") 

I tried urllib.request:

  import urllib
  url = "https://www.cs.cmu.edu/~./enron/enron_mail_20150507.tgz"
  urllib.request(url, filename="../enron_mail_20150507.tgz")
  print ("download complete!")

But it still gives me an error:

urllib.request(url, filename="../enron_mail_20150507.tgz")
TypeError: 'module' object is not callable

I am using PyCharm as my IDE.

CodeHead
  • 177
  • 1
  • 2
  • 12

3 Answers

9

You'd use urllib.request.urlretrieve. Note that this function "may become deprecated at some point in the future", so you might be better off using the less likely to be deprecated interface:
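For the direct fix first: in Python 3 the function moved to urllib.request.urlretrieve. A minimal sketch of the call shape (fetching a local file:// URL so it runs without network access; the temp-file names are only for illustration):

```python
import pathlib
import tempfile
import urllib.request

# Create a small local file to stand in for the remote download target.
src = tempfile.NamedTemporaryFile(delete=False, suffix=".txt")
src.write(b"hello")
src.close()

url = pathlib.Path(src.name).as_uri()  # e.g. file:///tmp/tmpabc123.txt
dest = src.name + ".copy"

# Python 3 spelling of the Python 2 urllib.urlretrieve call.
urllib.request.urlretrieve(url, filename=dest)
```

For the asker's case the call would be urllib.request.urlretrieve(url, filename="../enron_mail_20150507.tgz").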

import contextlib
import urllib.request

# Adapted from the source:
# https://hg.python.org/cpython/file/3.5/Lib/urllib/request.py#l170
url = "https://www.cs.cmu.edu/~./enron/enron_mail_20150507.tgz"
filename = "../enron_mail_20150507.tgz"

with open(filename, 'wb') as out_file:
    with contextlib.closing(urllib.request.urlopen(url)) as fp:
        block_size = 1024 * 8  # copy in 8 KiB chunks
        while True:
            block = fp.read(block_size)
            if not block:
                break
            out_file.write(block)

For small enough files, you could just read and write the whole thing and drop the loop entirely.
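For instance, a one-shot read looks like this (again demonstrated with a local file:// URL so the sketch stays runnable offline):

```python
import contextlib
import pathlib
import tempfile
import urllib.request

# Small stand-in file for the remote resource.
src = tempfile.NamedTemporaryFile(delete=False, suffix=".txt")
src.write(b"small payload")
src.close()

url = pathlib.Path(src.name).as_uri()
dest = src.name + ".copy"

# Whole-file variant: a single read() call, no chunking loop.
with contextlib.closing(urllib.request.urlopen(url)) as fp:
    data = fp.read()
with open(dest, "wb") as out_file:
    out_file.write(data)
```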

mgilson
  • 300,191
  • 65
  • 633
  • 696
4

You can use shutil.copyfileobj() to copy straight from the URL byte stream to the file.

import urllib.request
import shutil

url = "http://www.somewebsite.com/something.pdf"
output_file = "save_this_name.pdf"
with urllib.request.urlopen(url) as response, open(output_file, 'wb') as out_file:
    shutil.copyfileobj(response, out_file)

Source: https://stackoverflow.com/a/48691447/1174102

Socowi
  • 25,550
  • 3
  • 32
  • 54
Michael Altfield
  • 2,083
  • 23
  • 39
2

I know this question has long been answered, but I'll contribute for any future viewer.

The proposed solution is good, but the main issue is that it can generate empty files if you use an invalid URL.

As a workaround, here is how I adapted the code:

import contextlib
from urllib.request import urlopen

def getfile(url, filename, timeout=45):
    with contextlib.closing(urlopen(url, timeout=timeout)) as fp:
        block_size = 1024 * 8
        # Read the first block before opening the output file, so an
        # empty response never leaves an empty file on disk.
        block = fp.read(block_size)
        if block:
            with open(filename, 'wb') as out_file:
                out_file.write(block)
                while True:
                    block = fp.read(block_size)
                    if not block:
                        break
                    out_file.write(block)
        else:
            raise Exception('nonexisting file or connection error')

I hope this helps.

Al rl
  • 314
  • 1
  • 11