I am trying to scrape some data from http://www.pogdesign.co.uk/cat/.
I want to get the channel and the air time of each program, but by default they are not displayed. They appear only after I manually configure the settings and save them.
From inspecting the 'Network' tab in Chrome's developer tools, I understand that clicking 'Save Settings' sends a POST request with the relevant form parameters (e.g. 's_networks':'on', among others), followed by a GET request that retrieves the HTML page with the channel and the air time displayed.
I tried to emulate this process (a POST request followed by a GET request) using both Python's requests package and the mechanicalsoup package.
requests:
import requests
# A Session keeps cookies between the POST and the GET
s = requests.Session()
s.post('http://www.pogdesign.co.uk/cat/', data={'s_networks': 'on'})
s.get('http://www.pogdesign.co.uk/cat/')
mechanicalsoup:
import mechanicalsoup
mcs = mechanicalsoup.Browser()
res_post = mcs.post('http://www.pogdesign.co.uk/cat/', data={'s_networks': 'on'})
res_get = mcs.get('http://www.pogdesign.co.uk/cat/')
Yet the response I receive does not contain the channel and the air-time data.
The only difference I noticed is that the status code returned from the browser's POST request is 302, and the status code returned from my Python requests is 200.
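One possible explanation for the 302 versus 200 difference is that requests follows redirects by default, so the 200 may simply be the final page reached after a 302. A minimal sketch for checking this is below. The helper name redirect_chain is hypothetical, and 's_networks' is the only form field known from the Network tab, so the real form probably sends more fields than shown here.

```python
def redirect_chain(response):
    """Return [(status_code, url), ...] for every hop, final response last."""
    # response.history holds the intermediate redirect responses, oldest first
    hops = list(response.history) + [response]
    return [(r.status_code, r.url) for r in hops]


def main():
    # Third-party package; imported here so the helper above has no dependencies
    import requests

    url = 'http://www.pogdesign.co.uk/cat/'
    # Assumption: only 's_networks' is known; the browser likely posts more fields
    payload = {'s_networks': 'on'}

    with requests.Session() as s:
        # Default behaviour: redirects are followed, so a 302 ends up in .history
        res_post = s.post(url, data=payload)
        print(redirect_chain(res_post))
        # Cookies the server set during the exchange, if any
        print(s.cookies.get_dict())

        # Same POST without following redirects, to see the raw status code
        raw = s.post(url, data=payload, allow_redirects=False)
        print(raw.status_code, raw.headers.get('Location'))


if __name__ == '__main__':
    main()
```

If the chain shows a 302 followed by a 200, the status codes match what the browser does and the difference is elsewhere, for example in missing form fields or cookies. If no 302 appears even with allow_redirects=False, the server is probably not accepting the POST as a valid settings submission.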