I am trying to log into a website with Scrapy, but the response to the login request is an HTML document containing only inline JavaScript. The JS redirects to the page I want to scrape data from, but Scrapy does not execute it, so the crawl never reaches that page.
I use the following code to submit the required login form:
def parse(self, response):
    # Read the request_id value from the login page so it can be sent back with the form.
    request_id = response.css('input[name="request_id"]::attr(value)').extract_first()
    data = {
        'userid_placeholder': self.login_user,
        'foilautofill': '',
        'password': self.login_pass,
        'request_id': request_id,
        'username': self.login_user[1:],
    }
    yield scrapy.FormRequest(
        url='https://www1.up.ac.za/oam/server/auth_cred_submit',
        formdata=data,
        callback=self.print_p,
    )
The print_p callback function is as follows:
def print_p(self, response):
    print(response.text)
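At the moment print_p just shows me the HTML containing the inline script. One workaround I have considered is pulling the redirect target out of that script myself and following it with a plain Request, roughly as sketched below; the window.location pattern is only a guess at what the inline JS does, and parse_target is a hypothetical callback name. I would still prefer to have the JS executed properly.

import re  # would go at the top of the spider module

def print_p(self, response):
    # Guess: the inline script assigns the next URL to window.location.
    match = re.search(r'window\.location(?:\.href)?\s*=\s*["\']([^"\']+)', response.text)
    if match:
        # Follow the redirect manually, since Scrapy will not run the JS.
        yield scrapy.Request(response.urljoin(match.group(1)), callback=self.parse_target)

def parse_target(self, response):
    # Hypothetical callback for the page the JS redirects to.
    print(response.text)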
I have also looked at scrapy-splash, but I could not find a way to execute the JS in the response with it.
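For reference, this is roughly what I was hoping would work with scrapy-splash: submit the same form through a SplashFormRequest so that Splash renders the response, executes the inline JS and follows the redirect. The sketch assumes a Splash instance on localhost:8050 and the SPLASH_URL/middleware settings from the scrapy-splash README; I am not sure this is even the right approach.

from scrapy_splash import SplashFormRequest

def parse(self, response):
    request_id = response.css('input[name="request_id"]::attr(value)').extract_first()
    data = {
        'userid_placeholder': self.login_user,
        'foilautofill': '',
        'password': self.login_pass,
        'request_id': request_id,
        'username': self.login_user[1:],
    }
    # Same form submission as above, but rendered by Splash so the inline
    # JS in the response gets executed before print_p sees the page.
    yield SplashFormRequest(
        url='https://www1.up.ac.za/oam/server/auth_cred_submit',
        formdata=data,
        callback=self.print_p,
        endpoint='render.html',
        args={'wait': 2},
    )

If scrapy-splash is simply the wrong tool for following a JS redirect after a form post, a pointer to the right approach would be appreciated.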