Is there a way to make Splash wait until all JavaScript on a site has loaded?

Asked Apr 25 '19 at 23:33

Active Apr 29 '19 at 17:25

Viewed 51 times

I'm scraping a site using Scrapy and Splash. It all works well, but the spider collects less data than the target site shows, even with a wait of ten seconds. I have concluded that this is because some JavaScript has not finished loading when the spider collects the response. It would be great if the spider could wait until all JavaScript has loaded, since the time the site takes to generate its data may vary.

The most recent attempt used a wait of ten seconds:

import scrapy
from scrapy_splash import SplashRequest


class TargetExSpider(scrapy.Spider):
    name = "tnmo_btex"
    start_urls = [
        'https://www.targetsite.com'
    ]

    def start_requests(self):
        for url in self.start_urls:
            yield SplashRequest(url=url, callback=self.parse, args={'wait': 10})

    def parse(self, response):
        rows = response.xpath(".//tr[@class='ng-skyscope']")
        ...

I would love to have Splash wait for all the JavaScript to load before the response is collected.
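One possible direction, offered only as a sketch: rather than a fixed wait, a Splash Lua script sent to the `execute` endpoint can poll until the element the spider needs is present. The code below assumes that the rows matched in `parse` can be found with the CSS selector `tr.ng-skyscope` (derived from the XPath above). The helper name `splash_wait_args` and the `css`, `max_wait` and `poll` argument names are hypothetical, chosen for illustration. It waits for a specific element, not for "all JavaScript", so rows that arrive after the first match could still be missed.

```python
# Sketch: poll for a target element inside a Splash Lua script instead of
# relying on a fixed wait. Selector and timing values are assumptions.

LUA_WAIT_FOR_ELEMENT = """
function main(splash, args)
  assert(splash:go(args.url))
  local waited = 0
  -- Poll until the element exists or the time budget is used up.
  while not splash:select(args.css) and waited < args.max_wait do
    splash:wait(args.poll)
    waited = waited + args.poll
  end
  return {html = splash:html()}
end
"""


def splash_wait_args(css, max_wait=20.0, poll=0.5):
    """Build the `args` dict for a SplashRequest sent to the `execute` endpoint.

    `css`, `max_wait` and `poll` are custom argument names that the Lua
    script above reads from `args`; they are not built-in Splash options.
    """
    if max_wait <= 0 or poll <= 0:
        raise ValueError("max_wait and poll must be positive")
    return {
        "lua_source": LUA_WAIT_FOR_ELEMENT,
        "css": css,
        "max_wait": max_wait,
        "poll": poll,
    }
```

In `start_requests`, the request would then look like `SplashRequest(url=url, callback=self.parse, endpoint='execute', args=splash_wait_args('tr.ng-skyscope'))`. The polling budget probably needs to stay below Splash's own request timeout, which may have to be raised separately if the page is slow.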

Thanks

edited Apr 29 '19 at 17:25

Muhika Thomas

0 Answers