
I've already tried parsing the response as JSON, but I can't really read this page at all.

This is my Python code. I've tried it on other websites and it works, but on this website it returns a 403.

import urllib2

req = urllib2.Request('http://www.taringa.net/envivo/ajax.php')
response = urllib2.urlopen(req)
the_page = response.read()

print the_page

2 Answers


Better to use requests. I tried your script and also got a 403 status. That means access to it is forbidden; why, I do not know.

i.krivosheev
  • 387
  • 3
  • 18
  • That's what I was thinking... but I have to install the library on Win7 Python and I'm figuring out how to do it... but thanks, I'll try. The strange thing is that if you open the page in a browser you can see all the AJAX responses. – user2096328 Jun 09 '15 at 07:58
  • on windows install pip: http://stackoverflow.com/questions/4750806/how-to-install-pip-on-windows – i.krivosheev Jun 09 '15 at 08:07
  • I have installed pip and the package; now it says 403 Forbidden... this is the script: `import requests` / `r = requests.get('http://www.taringa.net/envivo/ajax.php', stream=True)` / `for line in r.iter_lines(): if line: print line` – user2096328 Jun 09 '15 at 08:24
  • The library will not solve your problem; it only makes the code easier to write. Why it returns a 403 status, I do not know. Perhaps you need to send some cookies and headers... – i.krivosheev Jun 09 '15 at 08:26
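As the last comment suggests, some servers reject requests that don't carry browser-like headers or cookies. A minimal sketch of attaching both with requests (Python 3 syntax; the cookie name and values here are made-up examples, not ones the site actually requires). Building a `PreparedRequest` lets you inspect exactly what would be sent without hitting the network:

```python
import requests

url = 'http://www.taringa.net/envivo/ajax.php'
headers = {'User-Agent': 'Mozilla/5.0'}   # pretend to be a browser
cookies = {'session': 'example-value'}    # hypothetical cookie

# Build the request without sending it, so we can inspect what would go out.
prepared = requests.Request('GET', url, headers=headers, cookies=cookies).prepare()
print(prepared.headers['User-Agent'])
print(prepared.headers['Cookie'])

# Actually sending it would be:
# r = requests.get(url, headers=headers, cookies=cookies)
```

To find out which cookies and headers a real browser sends, check the request in your browser's developer tools (Network tab) and copy them over.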

You have to add the 'User-Agent' header in order to make this work.

Urllib code:

import urllib2

req = urllib2.Request('http://www.taringa.net/envivo/ajax.php')
req.add_header('User-Agent', 'Mozilla')
resp = urllib2.urlopen(req)
print resp.code  # Gives 200.
print resp.read()  # Gives the HTML of the page.
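If you are on Python 3, urllib2 no longer exists; it was split into urllib.request. A rough equivalent of the snippet above (the header trick is the same; here the request is only built and inspected, not sent, so it runs without network access):

```python
# Python 3 port of the urllib2 snippet: urllib2.Request became
# urllib.request.Request, and print is a function.
import urllib.request

req = urllib.request.Request('http://www.taringa.net/envivo/ajax.php')
req.add_header('User-Agent', 'Mozilla')

# add_header() stores the key capitalized as 'User-agent'.
print(req.get_header('User-agent'))

# Sending it would be:
# resp = urllib.request.urlopen(req)
# print(resp.status, resp.read())
```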

I would recommend that you use requests mainly because it makes this kind of stuff very easy.

Requests code:

import requests

h = {'User-Agent': 'Mozilla'}
r = requests.get('http://www.taringa.net/envivo/ajax.php', headers=h)
print r.status_code  # Gives 200.
print r.text  # Gives the HTML of the page.
ssundarraj