I am attempting to scrape the website flow.gassco.no as one of my first Python projects. I need to bypass the splash screen, which redirects to the main page. I have isolated the following form:
<form method="get" action="acceptDisclaimer">
<input type="submit" value="Accept"/>
<input type="button" name="decline" value="Decline" onclick="window.location = 'http://www.gassco.no'" />
</form>
In a browser, appending 'acceptDisclaimer?' to the URL redirects to the target, flow.gassco.no. However, if I try to replicate this with urllib2, I appear to stay on the same page when I print the source:
import urllib2

url = "http://flow.gassco.no/acceptDisclaimer?"
url2 = "http://flow.gassco.no/"

# first pass to invoke disclaimer
req = urllib2.Request(url)
res = urllib2.urlopen(req)

# second pass to access main page
req2 = urllib2.Request(url2)
res2 = urllib2.urlopen(req2)
data = res2.read()
print data
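One thing I have not ruled out is that the site records the acceptance in a session cookie, which two independent urlopen calls would not share. This is only a guess on my part, not something I have confirmed. A minimal sketch of that idea is below, assuming the cookie theory holds; the helper names make_session_opener and fetch_main_page are my own, and the imports are written so the same code loads under Python 2 or Python 3.

```python
try:  # Python 2 module names, as in the snippet above
    import urllib2 as request
    import cookielib as cookiejar
except ImportError:  # Python 3 names for the same modules
    import urllib.request as request
    import http.cookiejar as cookiejar

BASE_URL = "http://flow.gassco.no/"


def make_session_opener():
    """Return (opener, jar): an opener that stores cookies and resends them."""
    jar = cookiejar.CookieJar()
    opener = request.build_opener(request.HTTPCookieProcessor(jar))
    return opener, jar


def fetch_main_page(opener, base_url=BASE_URL):
    """Request the disclaimer action, then the main page, via the same opener."""
    # First pass: hit the form's action so the server can set a cookie
    # (assumption: acceptance is tracked this way).
    opener.open(base_url + "acceptDisclaimer").read()
    # Second pass: the same opener sends back whatever cookies were stored.
    return opener.open(base_url).read()
```

With this in place, `opener, jar = make_session_opener()` followed by `fetch_main_page(opener)` would make both requests through one cookie-aware opener, and inspecting `jar` afterwards would show whether the site set any cookie at all.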
I suspect that I have oversimplified the problem, but I would appreciate any input on how I can accept the disclaimer and then output the main page source.