I'm trying to run this on a remote Ubuntu server:

>>> import urllib2, requests
>>> url = 'http://python.org/'
>>> urllib2.urlopen(url)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 406, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 519, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 444, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 527, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found

>>> requests.get(url)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/django/zyq2/venv/local/lib/python2.7/site-packages/requests/api.py", line 55, in get
    return request('get', url, **kwargs)
  File "/home/django/zyq2/venv/local/lib/python2.7/site-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/django/zyq2/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 382, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/django/zyq2/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 505, in send
    history = [resp for resp in gen] if allow_redirects else []
  File "/home/django/zyq2/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 99, in resolve_redirects
    raise TooManyRedirects('Exceeded %s redirects.' % self.max_redirects)
requests.exceptions.TooManyRedirects: Exceeded 30 redirects.

But it works fine on my local Windows machine:

>>> urllib2.urlopen(url)
<addinfourl at 57470168 whose fp = <socket._fileobject object at 0x036CB630>>
>>> requests.get(url)
<Response [200]>

I have no idea what's going on and would appreciate any suggestions.

Update

I tried S.M. Al Mamun's suggestion and got an exception with a long traceback:

>>> req = urllib2.Request(url, headers={ 'User-Agent': 'Mozilla/5.0' })
>>> urllib2.urlopen(req).read()
...
long traceback (more than one page)
...
urllib2.HTTPError: HTTP Error 303: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
See Other

So it is an infinite redirect loop again, the same problem that requests reports as the TooManyRedirects exception.

Vlad T.

1 Answer

Try using a user-agent:

import urllib2
# Send a browser-like User-Agent instead of the default Python one.
req = urllib2.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
urllib2.urlopen(req).read()

If it still doesn't work, your Ubuntu server might be offline!
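Since the update in the question shows a redirect loop rather than a plain connection failure, it may also help to look at each redirect hop individually instead of letting the library follow them. Below is a minimal sketch of that idea, not a definitive diagnosis tool. It assumes Python 3's standard library (`http.client` and `urllib.parse`; the transcripts in this thread use Python 2's `urllib2`), and `fetch_once` and `trace_redirects` are hypothetical helper names made up for this example.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_once(url, headers=None):
    """Send one GET without following redirects.

    Returns (status, value of the Location header or None).
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers=headers or {})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, max_hops=10, fetch=fetch_once):
    """Follow redirects by hand, recording each hop as (status, url, location)."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((status, url, location))
        # Stop at the first response that is not a redirect.
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops
```

Calling `trace_redirects('http://python.org/')` on the server and printing the returned hops should show which host and path each redirect points to. If the hops lead somewhere other than python.org, that would suggest something on the server's network path (a proxy or DNS setting, for example) is answering instead of the real site, but that is only a guess until the hops are inspected.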

Sharif Mamun
  • Hi, thank you for the reply, but it doesn't work; I updated the question with the result of your suggestion. What do you mean when you say Ubuntu is offline? It is a production server running several sites. – Vlad T. Apr 23 '14 at 06:25
  • Try to use the accepted answer from this post: http://stackoverflow.com/questions/9926023/handling-rss-redirects-with-python-urllib2 – Sharif Mamun Apr 23 '14 at 06:54
  • It throws the exception `TypeError: unbound method add_cookie_header() must be called with CookieJar instance as first argument (got Request instance instead)`. BTW, it should work without cookies, as shown in the first example in the Python docs: https://docs.python.org/2.7/howto/urllib2.html#fetching-urls – Vlad T. Apr 23 '14 at 07:45
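
The `TypeError` quoted in the last comment is the kind of message Python 2 produces when the `CookieJar` class itself, rather than an instance, is handed to `HTTPCookieProcessor`. Whether that is what happened here is an assumption, since the code that was run is not shown in the thread. A minimal sketch of a cookie-aware opener that creates the instance explicitly, written for Python 3's standard library (`urllib.request` and `http.cookiejar`, the successors of `urllib2` and `cookielib`); `make_cookie_opener` is a hypothetical helper name:

```python
import http.cookiejar
import urllib.request


def make_cookie_opener(user_agent="Mozilla/5.0"):
    """Build an opener that stores cookies between redirects.

    Returns (opener, jar) so the collected cookies can be inspected later.
    """
    # Note the parentheses: the processor needs a CookieJar *instance*,
    # not the CookieJar class.
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    # Replace the default Python user agent on every request made by this opener.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar
```

With this in place, `opener.open(url).read()` takes the place of `urlopen(url).read()`. Some sites redirect in a loop until a cookie set by an earlier response is sent back, so this is worth trying, though it does not explain why the same request succeeds from the Windows machine.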