0

I'm trying to write a small program that will simply display the header information of a website. Here is the code:

import urllib2

url = 'http://some.ip.add.ress/'

request = urllib2.Request(url)

try:
    html = urllib2.urlopen(request)
except urllib2.URLError, e:
    print e.code
else:
    print html.info()

If 'some.ip.add.ress' is google.com, the header information is returned without a problem. However, if it's an IP address that requires basic authentication before access, the request returns a 401. Is there a way to get header (or any other) information without authenticating?


I've worked it out.

After the try block has failed due to unauthorized access, the exception object still carries the response headers. The following modification will print the header information:

print e.info()

instead of:

print e.code
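To spell out why this works: the 401 is raised as an HTTPError, and that exception doubles as a response object, so its headers are still readable. Here is a minimal sketch of the idea. The helper name `fetch_headers` and the 10-second timeout are illustrative assumptions, not part of the original code, and the import fallback is only there so the same sketch runs on Python 3 (`urllib.request`) as well as Python 2 (`urllib2`).

```python
try:  # Python 3
    from urllib.request import Request, urlopen
    from urllib.error import HTTPError, URLError
except ImportError:  # Python 2
    from urllib2 import Request, urlopen, HTTPError, URLError


def fetch_headers(url, timeout=10):
    """Return (status_code, headers) for url, even on an HTTP error status.

    fetch_headers is a hypothetical helper name used for illustration.
    """
    try:
        response = urlopen(Request(url), timeout=timeout)
    except HTTPError as e:
        # HTTPError is itself a response-like object: the 401 reply still
        # carries headers (e.g. WWW-Authenticate), available via e.info().
        return e.code, e.info()
    except URLError:
        # No HTTP response at all (DNS failure, refused connection, ...),
        # so there is neither a status code nor any headers to report.
        return None, None
    else:
        return response.getcode(), response.info()
```

Calling `fetch_headers('http://some.ip.add.ress/')` should then give back the 401 together with the server's headers instead of only the status code. Note that HTTPError has to be caught before URLError, since it is a subclass of it and only HTTPError is guaranteed to have `code` and `info()`.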

Thanks for looking :)

cookertron

2 Answers

1

If you want just the headers, instead of using urllib2 you should go lower level and use httplib:

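import httplib

host = 'some.ip.add.ress'  # hostname or IP only, without the 'http://' scheme
path = '/'

conn = httplib.HTTPConnection(host)
conn.request("HEAD", path)  # HEAD asks for the headers only, with no response body
print conn.getresponse().getheaders()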
vartec
  • I tried that before and received a 401. Your method works great! Thanks for that – cookertron Apr 17 '12 at 16:07
  • I seem to be having problems again: the Python console is returning _[Errno 104] Connection reset by peer_ and I'm not sure what this means. I'm the peer, but I'm not resetting anything, am I!? – cookertron Apr 17 '12 at 16:31
  • That most likely means that the other end is not accepting HTTP connections. – vartec Apr 17 '12 at 16:38
  • You'd think so, but I can access it and log in via my Chrome browser just fine. Also, my original code (at the top) works fine and returns the headers no problem! Straaaange – cookertron Apr 17 '12 at 16:48
  • Instead of going low level, I think it would be better to show how to do a `HEAD` request using urllib2. – Piotr Dobrogost Apr 17 '12 at 18:30
0

If all you want are the HTTP headers, then you should make a HEAD request rather than a GET request. You can see how to do this by reading "Python - HEAD request with urllib2".
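For reference, one common way to do this is to subclass `Request` and override `get_method()`, which is what urllib2 consults when choosing the HTTP verb. The following is only a sketch of that approach, not necessarily what the linked question recommends. The names `HeadRequest` and `head_headers` and the timeout value are assumptions made for illustration, and the import fallback lets the same code run on Python 3 (`urllib.request`) as well as Python 2 (`urllib2`).

```python
try:  # Python 3
    from urllib.request import Request, urlopen
    from urllib.error import HTTPError
except ImportError:  # Python 2
    from urllib2 import Request, urlopen, HTTPError


class HeadRequest(Request):
    """A Request that is sent as HEAD instead of the default GET.

    HeadRequest is a hypothetical name; only get_method() is library API.
    """

    def get_method(self):
        return 'HEAD'


def head_headers(url, timeout=10):
    """Send a HEAD request and return (status_code, headers)."""
    try:
        response = urlopen(HeadRequest(url), timeout=timeout)
    except HTTPError as e:
        # Error statuses such as 401 still come with headers attached.
        return e.code, e.info()
    try:
        return response.getcode(), response.info()
    finally:
        response.close()
```

With this, `head_headers('http://some.ip.add.ress/')` should return the status and headers without downloading a body. Connection-level failures (URLError) are deliberately left to propagate in this sketch.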

Community
Piotr Dobrogost