I am trying to log into a website with Scrapy, but the response to the login request is an HTML document containing only inline JavaScript. The JS redirects to the page I want to scrape data from, but Scrapy does not execute it, so the crawl never reaches that page.
I use the following code to submit the required login form:
def parse(self, response):
    # Read the request_id value from the login page so it can be sent back with the form.
    request_id = response.css('input[name="request_id"]::attr(value)').extract_first()
    data = {
        'userid_placeholder': self.login_user,
        'foilautofill': '',
        'password': self.login_pass,
        'request_id': request_id,
        'username': self.login_user[1:],
    }
    yield scrapy.FormRequest(
        url='https://www1.up.ac.za/oam/server/auth_cred_submit',
        formdata=data,
        callback=self.print_p,
    )
The print_p callback function is as follows:
def print_p(self, response):
    print(response.text)
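At the moment print_p just shows me the HTML containing the inline script. One workaround I have considered is pulling the redirect target out of that script myself and following it with a plain Request, roughly as sketched below; the window.location pattern is only a guess at what the inline JS does, and parse_target is a hypothetical callback name. I would still prefer to have the JS executed properly.

import re  # would go at the top of the spider module

def print_p(self, response):
    # Guess: the inline script assigns the next URL to window.location.
    match = re.search(r'window\.location(?:\.href)?\s*=\s*["\']([^"\']+)', response.text)
    if match:
        # Follow the redirect manually, since Scrapy will not run the JS.
        yield scrapy.Request(response.urljoin(match.group(1)), callback=self.parse_target)

def parse_target(self, response):
    # Hypothetical callback for the page the JS redirects to.
    print(response.text)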
I have also looked at scrapy-splash, but I could not find a way to execute the JS in the response with it.
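For reference, this is roughly what I was hoping would work with scrapy-splash: submit the same form through a SplashFormRequest so that Splash renders the response, executes the inline JS and follows the redirect. The sketch assumes a Splash instance on localhost:8050 and the SPLASH_URL/middleware settings from the scrapy-splash README; I am not sure this is even the right approach.

from scrapy_splash import SplashFormRequest

def parse(self, response):
    request_id = response.css('input[name="request_id"]::attr(value)').extract_first()
    data = {
        'userid_placeholder': self.login_user,
        'foilautofill': '',
        'password': self.login_pass,
        'request_id': request_id,
        'username': self.login_user[1:],
    }
    # Same form submission as above, but rendered by Splash so the inline
    # JS in the response gets executed before print_p sees the page.
    yield SplashFormRequest(
        url='https://www1.up.ac.za/oam/server/auth_cred_submit',
        formdata=data,
        callback=self.print_p,
        endpoint='render.html',
        args={'wait': 2},
    )

If scrapy-splash is simply the wrong tool for following a JS redirect after a form post, a pointer to the right approach would be appreciated.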