urllib2 gives HTTP Error 400: Bad Request for certain urls, works for others

Question

I'm trying to make a simple HTTP GET request with Python's urllib2 module. It works for some sites, but for others I get HTTP Error 400: Bad Request. I know the URL itself is fine, because a plain urllib.urlopen(url) works; the error only appears on certain sites once I add headers and call urllib2.urlopen().

Here is the code that's not working:

# -*- coding: utf-8 -*-
import re,sys,urllib,urllib2

url = "http://www.gamestop.com/"

headers = {'User-Agent:':'Mozilla/5.0'}

req = urllib2.Request(url,None,headers)
response = urllib2.urlopen(req,None)
html1 = response.read()

(gamestop.com is an example of a URL that does not work)

Some sites work and others don't, so I'm not sure what I'm doing wrong here. Am I missing some important headers? Making the request incorrectly? Using the wrong User-Agent? (I also tried the exact User-Agent string of my browser, and that didn't fix anything.)

Thanks!

score 8 · Answer 1 · answered Jun 12 '11 at 04:06

You've got an extra colon in the header name in your headers dictionary.

headers = { 'User-Agent:': 'Mozilla/5.0' }

Should be:

headers = { 'User-Agent': 'Mozilla/5.0' }
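
To see why the stray colon matters: urllib2 uses the dictionary key as the header name, so the request most likely goes out with a line like `User-agent:: Mozilla/5.0`, which lenient servers accept and stricter ones may answer with 400. Below is a minimal sketch of a guard against this kind of typo. `build_request` and `_TOKEN` are hypothetical names, not part of urllib2, and the import fallback is only there so the snippet also runs on Python 3. It builds the request without sending it.

```python
import re

try:
    import urllib2 as request_lib          # Python 2, as in the question
except ImportError:
    import urllib.request as request_lib   # Python 3 equivalent (assumption)

# HTTP header names are "tokens": letters, digits and a few symbols,
# but never a colon or whitespace.
_TOKEN = re.compile(r"^[!#$%&'*+\-.^_`|~0-9A-Za-z]+\Z")


def build_request(url, headers):
    """Hypothetical helper: reject malformed header names up front."""
    for name in headers:
        if not _TOKEN.match(name):
            raise ValueError("invalid header name: %r" % name)
    return request_lib.Request(url, None, headers)


req = build_request("http://www.gamestop.com/", {"User-Agent": "Mozilla/5.0"})
# Request stores header names capitalized, hence "User-agent" here.
print(req.get_header("User-agent"))  # prints: Mozilla/5.0
```

Passing the original `{'User-Agent:': 'Mozilla/5.0'}` to this helper raises `ValueError` locally instead of producing a request that some servers reject.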

Dietrich Epp


Wow. Yup, that was the issue. Thanks! – Tom Jun 12 '11 at 04:25

@Tom: Welcome to StackOverflow. "Wow. Yup, that was the issue. Thanks!" is best expressed "[by clicking on the check box outline to the left of the answer](http://stackoverflow.com/faq#howtoask)". – johnsyweb Jun 12 '11 at 05:41
