I wish to download data for thousands of records from a government site using Python 2.7. One example of a record is http://camara.cl/pley/pley_detalle.aspx?prmID=1252&prmBL=1-07. Two related problems:
(1) the site relies on a mouse click on a link that fires a JavaScript postback (in the source:
<a href="javascript:__doPostBack('ctl00$mainPlaceHolder$btnUrgencias','')">Urgencias</a>
) to reach another part of the data I am interested in (more on that call below); and
(2) I am illiterate in web scraping in general and Python in particular.
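As to (1): from what I have read about ASP.NET pages, __doPostBack is not magic; it fills in two hidden form fields and submits the page's form. So my understanding (which may well be wrong) is that clicking 'Urgencias' makes the browser POST roughly the following back to the same URL, together with the page's own hidden fields (__VIEWSTATE etc.), which I have not listed here:

# my reading of what __doPostBack('ctl00$mainPlaceHolder$btnUrgencias','')
# adds to the form data the browser sends back
postback = {
    '__EVENTTARGET': 'ctl00$mainPlaceHolder$btnUrgencias',
    '__EVENTARGUMENT': '',
}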
Learning-by-doing has so far taken me about half-way, and a few internet resources pushed me in the right direction, but I have hit a wall.
I can get the page source for the information that appears on screen when the URL is opened:
import requests

id = '1252'    # prmID in the query string
bl = '1-07'    # prmBL in the query string
url = 'http://camara.cl/pley/pley_detalle.aspx'
parametros = {'prmID': id, 'prmBL': bl}
r = requests.get(url, params=parametros)    # same as opening the URL in a browser
hitos = r.text                              # HTML of the page as it first loads
print hitos
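I assume the same GET can also be done through a requests.Session, which, as I understand it, keeps the server's cookies around for later requests; a minimal sketch, assuming Session behaves as documented:

import requests

s = requests.Session()                      # a Session re-sends cookies on later requests
r = s.get('http://camara.cl/pley/pley_detalle.aspx',
          params={'prmID': '1252', 'prmBL': '1-07'})
hitos = r.text
print r.status_code                         # should be 200 if the page loaded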
But I have had no success getting the information behind the 'Urgencias' tab. One attempt looks like this:
import json

# my attempt at faking the __doPostBack: POST the event target along with the parameters
parametros = {'prmID': id, 'prmBL': bl,
              '__EVENTTARGET': 'ctl00$mainPlaceHolder$btnUrgencias'}
headers = {'content-type': 'application/x-www-form-urlencoded; charset=utf-8'}
p = requests.post(url, data=json.dumps(parametros), headers=headers)
urgencias = p.text
print urgencias
I am obviously not building/sending the request properly. (I am also missing cookies, I believe.)
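From what I have pieced together, an ASP.NET postback expects (a) the body to be form-encoded rather than JSON, (b) the page's hidden state fields (__VIEWSTATE and friends) to be sent back, and (c) the cookies from the initial GET. Below is my best guess at the shape of the request. It is only a sketch: it assumes BeautifulSoup is installed and that this page uses the standard hidden-field names, and I have not managed to confirm it works, hence this question. Is something along these lines what is needed?

import requests
from bs4 import BeautifulSoup   # pip install beautifulsoup4

url = 'http://camara.cl/pley/pley_detalle.aspx'
parametros = {'prmID': '1252', 'prmBL': '1-07'}

s = requests.Session()                          # keeps the ASP.NET cookies
r = s.get(url, params=parametros)               # initial page, as in the first snippet
soup = BeautifulSoup(r.text, 'html.parser')

# echo back the hidden state fields the server put in the page
datos = {'__EVENTTARGET': 'ctl00$mainPlaceHolder$btnUrgencias',
         '__EVENTARGUMENT': ''}
for campo in ('__VIEWSTATE', '__VIEWSTATEGENERATOR', '__EVENTVALIDATION'):
    tag = soup.find('input', {'name': campo})
    if tag is not None:                         # not every page has all three fields
        datos[campo] = tag.get('value', '')

# a plain dict is form-encoded by requests itself; no json.dumps, no manual header
p = s.post(r.url, data=datos)                   # r.url keeps prmID/prmBL in the query string
urgencias = p.text
print urgencias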
Any help will be greatly appreciated. (I am open to any method that works from an Ubuntu machine!)