How do I check the HTTP status code of an object without downloading it?

Question

>>> a=urllib.urlopen('http://www.domain.com/bigvideo.avi')
>>> a.getcode()
404
>>> a=urllib.urlopen('http://www.google.com/')
>>> a.getcode()
200

My question is: bigvideo.avi is 500 MB. Does my script download the file first and then check the status code? Or can it check the code immediately, without saving the file?

score 18 · Accepted Answer · answered Nov 13 '09 at 19:33

You want to actually tell the server not to send the full content of the file. HTTP has a mechanism for this called "HEAD" that is an alternative to "GET". It works the same way, but the server only sends you the headers, none of the actual content.

That saves bandwidth for at least one of you. Simply not calling read() only means your script does not bother fetching the full file; the server may still start sending it.

Try this:

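import httplib

# Hostname only, with no scheme and no path
c = httplib.HTTPConnection("www.example.com")
# The path goes in the request; HEAD asks for headers only
c.request("HEAD", "/foo")
print c.getresponse().status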

This prints the status code. The URL should be only the path segment, like "/foo", and the hostname should look like "www.example.com".
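For Python 3, a minimal sketch of the same HEAD check using `http.client` might look like the following. The `DemoHandler` class, the `head_status` helper, and the local test server are assumptions added here so the example runs without network access; they are not part of the original answer.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local server so the sketch runs without network access."""

    def do_HEAD(self):
        # Pretend only "/foo" exists; every other path is a 404.
        self.send_response(200 if self.path == "/foo" else 404)
        self.end_headers()  # headers only, no body is ever written

    def log_message(self, format, *args):
        pass  # silence per-request logging


def head_status(host, path, port=80, timeout=10):
    """Send a HEAD request and return the numeric status code."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


# Start the demo server on a free local port in a background thread.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

found = head_status("127.0.0.1", "/foo", port)
missing = head_status("127.0.0.1", "/bigvideo.avi", port)
print(found, missing)

server.shutdown()
server.server_close()
```

Against a real site, the call would be something like `head_status("www.example.com", "/foo")`, with the hostname and path split as described above.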

in py3k it's `http.client` instead of `httplib` and the rest is exactly the same. — SilentGhost, Nov 13 '09 at 20:01

score 1 · Answer 2 · edited May 23 '17 at 11:59

Yes, it will fetch the file.

I think what you really want to do is send an HTTP HEAD request (which asks the server not for the data itself, but for the headers only). You can look here.
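If you would rather stay with a urllib-style API, Python 3's `urllib.request.Request` accepts a `method` argument, so a HEAD request can be sketched roughly as below. The `status_via_head` helper, the `DemoHandler` class, and the local server are hypothetical names used only for illustration. Note that the opener raises `HTTPError` for 4xx and 5xx responses, so the status code has to be read from the exception in that case.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local server so the sketch runs without network access."""

    def do_HEAD(self):
        # Pretend only "/foo" exists; every other path is a 404.
        self.send_response(200 if self.path == "/foo" else 404)
        self.end_headers()  # headers only, no body

    def log_message(self, format, *args):
        pass  # silence per-request logging


# Assumption for this demo: ignore any proxy configured in the environment
# so the request reaches the local server directly.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def status_via_head(url, timeout=10):
    """Return the status code of a HEAD request to url."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Raised for 4xx/5xx; the code is still available on the exception.
        return err.code


# Start the demo server on a free local port in a background thread.
demo_server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=demo_server.serve_forever, daemon=True).start()
base_url = "http://127.0.0.1:%d" % demo_server.server_address[1]

ok_code = status_via_head(base_url + "/foo")
missing_code = status_via_head(base_url + "/bigvideo.avi")
print(ok_code, missing_code)

demo_server.shutdown()
demo_server.server_close()
```

If you need your normal proxy settings, `urllib.request.urlopen(request, timeout=timeout)` can be used in place of the custom opener.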

answered Nov 13 '09 at 19:30

Ofri Raviv


Corey Goldberg · Answer 3 · 2009-11-13T19:34:34.360

I think your code already does that. You never call the read() method on the response, so you never actually download the file's contents.

Better yet, you could send an HTTP HEAD request using httplib instead of the HTTP GET that your urllib code does.

edited Nov 13 '09 at 19:34

answered Nov 13 '09 at 19:28


So that means, if I were to check the status code of a 500-gigabyte file, it would only take a second? – TIMEX Nov 13 '09 at 19:30

That's not entirely true. Because urllib sent a full request to the server, the server will start dumping it, even if it doesn't get all the way to the app. – Ken Kinder Nov 13 '09 at 19:31

Ken, I know what you mean, but his question was how to do it without downloading the file, and in this case no content is read by the client after the response header. – Corey Goldberg Nov 13 '09 at 19:33

@corey: It might still block, and you're wasting bandwidth. – Jed Smith Nov 13 '09 at 19:35

That's true, but what he really wants is HEAD, which won't waste bandwidth on either side. – Ken Kinder Nov 13 '09 at 19:35
