I am trying to scrape the PAK value for different addresses from this website. The form requires a settlement name and a street name as input. While researching this, I found a similar problem here. The payload structure there is similar to my case: both submit the POST request through an onclick handler that calls javascript:WebForm_DoPostBackWithOptions. I adapted that solution and update the __VIEWSTATE, __VIEWSTATEGENERATOR and __EVENTVALIDATION values after requests.get(url), but the request keeps returning an error.
Screenshot of a successful request.
import requests
from bs4 import BeautifulSoup
link = 'https://www.posta.rs/eng/alati/pronadjite-pak.aspx'
h = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Cookie': 'ASP.NET_SessionId=s1h1n5jfkkrxhlptaoxznw32; _fbp=fb.1.1681200067437.833965447; _gid=GA1.2.2099936103.1681297769; _ga_XFLRYFZ3P4=GS1.1.1681457453.9.1.1681457479.34.0.0; _ga=GA1.2.1138011418.1681200067',
    'Host': 'www.posta.rs',
    'Origin': 'https://www.posta.rs',
    'Referer': 'https://www.posta.rs/eng/alati/pronadjite-pak.aspx',
    'sec-ch-ua': '"Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="11"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': 'Windows',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-User': '?1',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36'
}
payload = {
    '__EVENTTARGET': '',
    '__EVENTARGUMENT': '',
    '__LASTFOCUS': '',
    'ctl00$ctl00$cphMain$cphAlati$pronadjitepakusercontrol$PronadjiGroup': 'rbPronadjiPoAdresi',
    'tNaselje': 'ZVONCE',
    'tUlica': 'b',
    'ctl00$ctl00$cphMain$cphAlati$pronadjitepakusercontrol$pakpoadresiusercontrol$adresa1$tbBroj': '',
    'ctl00$ctl00$cphMain$cphAlati$pronadjitepakusercontrol$pakpoadresiusercontrol$adresa1$tbGrad': 'ZVONCE',
    'ctl00$ctl00$cphMain$cphAlati$pronadjitepakusercontrol$pakpoadresiusercontrol$adresa1$tbGradId': '',
    'ctl00$ctl00$cphMain$cphAlati$pronadjitepakusercontrol$pakpoadresiusercontrol$adresa1$tbUlica': 'b',
    'ctl00$ctl00$cphMain$cphAlati$pronadjitepakusercontrol$pakpoadresiusercontrol$adresa1$tbUlicaId': '',
    'ctl00$ctl00$cphMain$cphAlati$pronadjitepakusercontrol$pakpoadresiusercontrol$btnPronadji': 'Find',
    'ctl00$ctl00$cphMain$cphAlati$pronadjitepakusercontrol$pakpoadresiusercontrol$tbTekPak': '',
    'ctl00$ctl00$cphMain$cphAlati$pronadjitepakusercontrol$pakpoadresiusercontrol$tbVidljiv': 'none',
    'ctl00$ctl00$cphMain$cphAlati$pronadjitepakusercontrol$pakpoadresiusercontrol$tbIndex': '',
    'ctl00$ctl00$cphMain$cphAlati$pronadjitepakusercontrol$pakpoadresiusercontrol$tbTabela': '',
    'ctl00$ctl00$cphMain$cphAlati$pronadjitepakusercontrol$pakpoadresiusercontrol$tbTabClass': ''
}
with requests.Session() as s:
    s.headers = h
    r = s.get(link)
    soup = BeautifulSoup(r.text, "lxml")
    payload['__VIEWSTATE'] = soup.select_one("#__VIEWSTATE")['value']
    payload['__VIEWSTATEGENERATOR'] = soup.select_one("#__VIEWSTATEGENERATOR")['value']
    payload['__EVENTVALIDATION'] = soup.select_one("#__EVENTVALIDATION")['value']
    res = s.post(link, data=payload)
    soup = BeautifulSoup(res.text, "lxml")
    print(soup.find('table', id='cphMain_cphAlati_pronadjitepakusercontrol_pakpoadresiusercontrol_GVRezultatPaka'))
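To rule out a missing hidden field, the three lookups above could be generalised so that every hidden input returned by the GET is copied into the payload. The following is only a minimal sketch of that idea: `HiddenInputCollector`, `collect_hidden_fields` and `SAMPLE_HTML` are hypothetical names and made-up markup, not taken from the real page, and it uses the standard-library `html.parser` instead of BeautifulSoup so that it runs without a network request.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect the name/value pair of every <input type="hidden"> tag."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <input ... />.
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def collect_hidden_fields(html):
    """Return a dict of all hidden form fields found in the given HTML."""
    parser = HiddenInputCollector()
    parser.feed(html)
    parser.close()
    return parser.fields


# Made-up stand-in for the body of the GET response (assumption).
SAMPLE_HTML = """
<form method="post" action="./pronadjite-pak.aspx">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__VIEWSTATEGENERATOR" id="__VIEWSTATEGENERATOR" value="DEADBEEF" />
  <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="xyz789" />
  <input type="text" name="tNaselje" value="" />
</form>
"""

hidden = collect_hidden_fields(SAMPLE_HTML)
print(hidden)
```

In the real script this would amount to `payload.update(collect_hidden_fields(r.text))` right after the GET, so any hidden field I have overlooked is sent back with the values the server just issued.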
I have also tried setting __EVENTTARGET to 'ctl00$ctl00$cphMain$cphAlati$pronadjitepakusercontrol$pakpoadresiusercontrol$btnPronadji' to simulate clicking the button and trigger the postback, but I guess that is not how it works. From my understanding, as long as I send the same payload that the browser generates when the 'Find' button is clicked, the server should process the postback and return the results.
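To make the two attempts explicit, here is a small sketch of how the two payload variants differ. `build_postback_payload` is a hypothetical helper, the hidden-field values are placeholders, and I am assuming that in the `__EVENTTARGET` variant the button's own name/value pair is left out; I am not sure which of the two the server actually expects.

```python
BTN = ('ctl00$ctl00$cphMain$cphAlati$pronadjitepakusercontrol'
       '$pakpoadresiusercontrol$btnPronadji')


def build_postback_payload(hidden_fields, form_values, button=None, event_target=None):
    """Merge hidden fields and form values for one of two postback styles.

    button:       (name, value) of a submit button, sent as an ordinary form
                  field with an empty __EVENTTARGET.
    event_target: control name placed in __EVENTTARGET, with no button field.
    """
    if (button is None) == (event_target is None):
        raise ValueError("pass exactly one of button or event_target")
    payload = dict(hidden_fields)
    payload.update(form_values)
    payload.setdefault("__EVENTARGUMENT", "")
    if button is not None:
        name, value = button
        payload["__EVENTTARGET"] = ""
        payload[name] = value
    else:
        payload["__EVENTTARGET"] = event_target
    return payload


# Placeholder values; the real ones come from the GET response.
hidden_fields = {"__VIEWSTATE": "abc123", "__EVENTVALIDATION": "xyz789"}
form_values = {"tNaselje": "ZVONCE", "tUlica": "b"}

# Variant 1: what my script above sends (button as a normal field).
as_button = build_postback_payload(hidden_fields, form_values, button=(BTN, "Find"))

# Variant 2: what I tried afterwards (button name in __EVENTTARGET).
as_target = build_postback_payload(hidden_fields, form_values, event_target=BTN)
```

Both variants carry the same hidden fields and form values; they differ only in whether the button appears as its own field or as the `__EVENTTARGET` value.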