I am trying to fetch an HTML page using Python's urlopen, and I am getting this error:

HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop

The code:

from urllib2 import Request, urlopen
request = Request(url)
response = urlopen(request)

I understand that the server redirects to another URL and that it is looking for a cookie.
How do I set the cookie it is looking for so I can read the html?

Dzinx
yossi
  • You have a web page that redirects to another web page, which redirects back to the first one (a loop). – Mariusz Jamro Feb 02 '12 at 14:06
  • Yeah, I know that. I am looking for a way to get around this. – yossi Feb 02 '12 at 14:10
  • Drop the link in a [Redirect Checker](http://www.internetofficer.com/seo-tool/redirect-check/). See what it comes up with. And does this work in a browser? What about a browser running in private/incognito mode with data cleared? – Brigand Feb 02 '12 at 14:10
  • I checked with Redirect Checker: it redirects to itself. It does work in a browser because the browser supports cookies. – yossi Feb 02 '12 at 14:36

1 Answer

Here's an example from the Python documentation, adjusted to your code:

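import cookielib, urllib2
# The CookieJar stores cookies set by the server's responses
cj = cookielib.CookieJar()
# The opener sends stored cookies back on later requests, including redirects
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
request = urllib2.Request(url)
response = opener.open(request)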
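
If you are on Python 3, the same approach should carry over: cookielib was renamed to http.cookiejar, and the urllib2 classes used here live in urllib.request. Below is a minimal sketch under that assumption. The helper names build_cookie_opener and fetch_html are only illustrative, not part of the standard library, and I have not run it against your server:

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return (opener, jar). The opener saves cookies from responses
    into the jar and sends them back on later requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def fetch_html(url):
    """Fetch url with cookie support and return the raw body as bytes.

    Hypothetical helper: the caller decodes the bytes with the page's
    actual encoding.
    """
    opener, _jar = build_cookie_opener()
    with opener.open(urllib.request.Request(url)) as response:
        return response.read()
```

Calling fetch_html(url) with your own URL would then take the place of urlopen(request). Keeping the jar around (the second value from build_cookie_opener) lets you inspect which cookies the server actually set if the redirect loop persists.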
Dzinx
  • Worked perfectly. Was messing around with Selenium and Spynner, but really it seems that I can use urllib2 for everything, even with cookies and form data and redirects and session ids that expire. – user984003 Mar 21 '13 at 14:19