
I'm trying to make a simple HTTP GET request with Python's urllib2 module. It works sometimes, but other times I get HTTP Error 400: Bad Request. I know the URL isn't the problem, because a plain urllib.urlopen(url) works fine. When I add headers and call urllib2.urlopen() instead, I get Bad Request on certain sites.

Here is the code that's not working:

# -*- coding: utf-8 -*-
import re,sys,urllib,urllib2

url = "http://www.gamestop.com/"

headers = {'User-Agent:':'Mozilla/5.0'}

req = urllib2.Request(url,None,headers)
response = urllib2.urlopen(req,None)
html1 = response.read()

(gamestop.com is an example of a URL that does not work)

Some sites work and others don't, so I'm not sure what I'm doing wrong here. Am I missing some important headers? Making the request incorrectly? Using the wrong User-Agent? (I also tried the exact User-Agent string of my browser, and that didn't fix anything.)

Thanks!

insumity
Tom

1 Answer


You've got an extra colon in your headers.

headers = { 'User-Agent:': 'Mozilla/5.0' }

Should be:

headers = { 'User-Agent': 'Mozilla/5.0' }
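
To see why the stray colon matters, here is a minimal sketch that builds both requests without sending them and inspects the header names they store. The library adds its own ": " separator after the name when the request goes out, so the bad header would presumably reach the server as "User-agent:: Mozilla/5.0", which some servers reject and others tolerate. The import fallback and the invalid_header_names helper are my own additions for illustration, not part of the question's code or of urllib2.

```python
try:
    import urllib2 as request_lib          # Python 2, as used in the question
except ImportError:
    import urllib.request as request_lib   # assumed Python 3 equivalent

url = "http://www.gamestop.com/"

# Build (but do not send) one request with the buggy key and one with the fix.
bad = request_lib.Request(url, None, {'User-Agent:': 'Mozilla/5.0'})
good = request_lib.Request(url, None, {'User-Agent': 'Mozilla/5.0'})

# Request stores names via str.capitalize(), so the extra colon survives.
bad_names = [name for name, _ in bad.header_items()]
good_names = [name for name, _ in good.header_items()]


def invalid_header_names(headers):
    """Hypothetical helper: return header names containing ':' or whitespace."""
    return [name for name in headers
            if ':' in name or any(ch.isspace() for ch in name)]


print(bad_names)
print(good_names)
print(invalid_header_names({'User-Agent:': 'Mozilla/5.0'}))
```

Running a check like invalid_header_names over the headers dict before building the request is one cheap way to catch this kind of typo early.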
Dietrich Epp
  • Wow. Yup, that was the issue. Thanks! – Tom Jun 12 '11 at 04:25
  • @Tom: Welcome to StackOverflow. "Wow. Yup, that was the issue. Thanks!" is best expressed "[by clicking on the check box outline to the left of the answer](http://stackoverflow.com/faq#howtoask)". – johnsyweb Jun 12 '11 at 05:41