
So, I'm messing around with urllib.request in Python 3 and am wondering how to write the result of getting an internet file to a file on the local machine. I tried this:

import urllib.request

g = urllib.request.urlopen('http://media-mcw.cursecdn.com/3/3f/Beta.png')
with open('test.png', 'b+w') as f:
    f.write(g)

But I got this error:

TypeError: 'HTTPResponse' does not support the buffer interface

What am I doing wrong?

NOTE: I have seen this question, but it's related to Python 2's urllib2, which was overhauled in Python 3.

Nathan2055
    possible duplicate of [Download file from web in Python 3](http://stackoverflow.com/questions/7243750/download-file-from-web-in-python-3) – kenorb Jul 27 '15 at 22:41

2 Answers


Change

f.write(g)

to

f.write(g.read())
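
For completeness, a minimal full sketch of the fix: streaming the response with `shutil.copyfileobj` avoids holding large files entirely in memory. The `data:` URL below is only a stand-in so the sketch runs without network access; in practice you would pass the image URL from the question.

```python
import shutil
import urllib.request

def download(url, path):
    # Stream the response body to disk in chunks rather than
    # buffering the whole file in memory with read()
    with urllib.request.urlopen(url) as response, open(path, 'wb') as f:
        shutil.copyfileobj(response, f)

# data: URL used only so this runs offline; substitute e.g.
# 'http://media-mcw.cursecdn.com/3/3f/Beta.png' in practice
download('data:text/plain;base64,aGVsbG8=', 'test.bin')
```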
Sheng

An easier way, I think (you can also do it in two lines), is to use:

import urllib.request
urllib.request.urlretrieve('http://media-mcw.cursecdn.com/3/3f/Beta.png', 'test.png')

As for the method you used: when you call g = urllib.request.urlopen('http://media-mcw.cursecdn.com/3/3f/Beta.png') you are only opening a connection to the file. You must use g.read(), g.readlines() or g.readline() to actually read it.

It's just like reading a normal file (except for the syntax) and can be treated in a very similar way.
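
For instance, the response object supports the familiar file-reading calls and returns bytes. A small sketch, using a `data:` URL only so it runs without network access (an http:// URL behaves the same way):

```python
import urllib.request

# The data: URL encodes b'one\ntwo\n'; an http:// response reads the same way
with urllib.request.urlopen('data:text/plain;base64,b25lCnR3bwo=') as g:
    first = g.readline()  # one line, as bytes
    rest = g.read()       # the remainder, as bytes

print(first)  # b'one\n'
print(rest)   # b'two\n'
```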

Xantium
  • `PEP20` would have you use `Request` from `urllib.request`, but yours is one line less of code. [Information about PEP20 for Request](https://docs.python.org/3.2/library/urllib.request.html). You can use `open()` chained to `file.write(url.read())` like you mentioned. – Debug255 Feb 14 '18 at 02:36
  • @Debug255 Are you sure? The link mentioned `Open the URL url, which can be either a string or a Request object.`; here I specified a string, so I don't think Request is required in this case. – Xantium Feb 14 '18 at 08:46
  • That worked on debian9 using python3.5. I don't use 2.7 too much. – Debug255 Mar 13 '18 at 04:27
  • 1
    This doesn't work if you have to get round the `403: Forbidden` issue using https://stackoverflow.com/a/16187955/563247 – Robert Johnstone Apr 29 '20 at 10:56
  • @Sevenearths That's true. However, that's a different issue. Out of all the files I have used Python to download/read, only a handful have ever given me a 403 error. I don't think this is a big enough reason not to warrant the use of `urlretrieve()`. Obviously, if that issue is encountered, then what you have linked is the way forward – Xantium Apr 29 '20 at 11:54
  • Interesting how experiences differ. While writing my app, the **first** URL I tried, `https://medium.com/@tomaspueyo/coronavirus-the-hammer-and-the-dance-be9337092b56`, gave me the `403: Forbidden`. I wonder if it's just a Medium-related issue – Robert Johnstone Apr 29 '20 at 12:46
  • 1
    @Sevenearths 403 is a Forbidden error. This usually happens when a website (server) attempts to block a bot, or when you try to access a webpage with incorrect login/cert information (usually cookie-related, from my experience: passing outdated information or similar). Seeing as the solution you listed uses a user agent, it strongly looks like that site attempts to block bots (which makes sense, since it's a news site); a user agent tricks the server into thinking it's a legitimate browser. – Xantium Apr 29 '20 at 13:36
  • @Sevenearths Personally I usually use dedicated APIs (and this sort of thing never comes up, as they expect bots), which is probably why I don't encounter the problem much. – Xantium Apr 29 '20 at 13:36
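
The user-agent workaround discussed in these comments can be sketched as follows. The header value is illustrative, and a `data:` URL stands in for the blocked address so the sketch runs without network access:

```python
import urllib.request

# Some servers answer 403 to the default Python user agent; sending a
# browser-like User-Agent header via a Request object usually gets past it.
# The data: URL is a stand-in so this runs offline; use the blocked
# http(s) URL in practice. The header value is illustrative.
req = urllib.request.Request(
    'data:text/plain;base64,aGVsbG8=',
    headers={'User-Agent': 'Mozilla/5.0'},
)
with urllib.request.urlopen(req) as response:
    data = response.read()

print(data)  # b'hello'
```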