Possible Duplicate:
Python’s urllib2 doesn’t work on some sites

Ok, I just want to access this URL using python: http://www.gocomics.com/wizardofid/2013/01/22

But whenever I call urllib2.urlopen('http://www.gocomics.com/wizardofid/2013/01/22').read(), I get a 403 error. With urllib, all I can do is read the error page, whereas urllib2 raises the error. The page loads fine in Chrome. Why is this, and how can I fix it? Thanks!

Tom
  • No, it is not a duplicate. I tried using a user agent, it didn't work. – Tom Jan 23 '13 at 02:05
  • @SimpleCoder basically, all i did was urllib2.urlopen('http://www.gocomics.com/wizardofid/2013/01/22').read() – Tom Jan 23 '13 at 02:05
  • @SimpleCoder and urllib.urlopen('http://www.gocomics.com/wizardofid/2013/01/22').read() – Tom Jan 23 '13 at 02:05

1 Answer

This particular website requires a "browser-like" User-Agent header; without one, it denies access.

Try adding a header, for instance:

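# Python 2 (urllib2 became urllib.request in Python 3)
import urllib2

# Build an opener that sends a browser-like User-Agent with every request
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
# Install it globally so that plain urllib2.urlopen() calls use it
urllib2.install_opener(opener)
print urllib2.urlopen('http://gocomics.com/wizardofid/2013/01/22').read()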
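
For anyone on Python 3, where urllib2 no longer exists, the same idea can be sketched with urllib.request. This is only an illustration, not a tested fix for this site: the helper names build_request and fetch are made up for the example, and the 'Mozilla/5.0' string is simply the one used above, so a site may want a fuller User-Agent.

```python
import urllib.error
import urllib.request

# Assumed value, copied from the snippet above; some sites may require a fuller string.
USER_AGENT = "Mozilla/5.0"


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that carries a browser-like User-Agent (hypothetical helper)."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url):
    """Return the response body as bytes; report the status and re-raise on an HTTP error."""
    try:
        with urllib.request.urlopen(build_request(url)) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # Like urllib2, urllib.request raises HTTPError for 4xx/5xx responses.
        print("Server answered with HTTP", err.code)
        raise
```

Calling fetch('http://www.gocomics.com/wizardofid/2013/01/22') should then return the page contents, provided the server accepts this User-Agent. Setting the header on a single Request, rather than installing a global opener, keeps the change local to this one call.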
favoretti