
I am using a Scrapy script to load a URL with `yield`.

MyUrl = "http://www.example.com"
request = Request(MyUrl, callback=self.mydetail)
yield request

def mydetail(self, response):
    item = {}
    item['Description'] = response.xpath(".//table[@class='list']//text()").extract()
    return item

The URL seems to take a minimum of 5 seconds to load, so I want Scrapy to wait some time for the entire text to load into item['Description']. I tried DOWNLOAD_DELAY in settings.py, but it didn't help.

Prabhakar
  • Scrapy downloads the whole response before running your callback. The load time you notice in your browser may be due to additional things fetched/rendered via JavaScript, which Scrapy does not do on its own. Try `scrapy shell ` to see what Scrapy "sees" on the site. You need to check what else the page fetches and modify your code to match that, or use a headless browser to render the page's JavaScript (e.g. Splash, Selenium). – marven Feb 28 '15 at 02:47
  • I have used Splash for rendering JavaScript, but the output is empty. I am not sure whether Scrapy is rendering my JavaScript page. – Prabhakar Mar 14 '15 at 08:33
  • Regardless of whether you use Splash, what @marven said holds true: Scrapy will wait for the whole response before proceeding. If you use Splash, then Splash becomes the new "web server". From Scrapy's point of view, Splash is its endpoint, and Scrapy will wait until Splash returns the entirety of the response. – Rejected Aug 25 '15 at 18:38
  • As is, your callback is "self.mydetail", but the function is "jobdetail". Is this a typo? – Rejected Aug 25 '15 at 18:41
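
Following up on the Splash suggestion in the comments, here is a minimal sketch of what the spider could look like with the scrapy-splash plugin. It assumes a Splash instance is running and configured in settings.py; the URL, the 5-second wait, and the callback name are taken from the question, while everything else is an assumption rather than the asker's actual setup:

```python
# Sketch only: assumes scrapy-splash is installed and a Splash server is
# reachable at SPLASH_URL (e.g. http://localhost:8050), with the plugin's
# downloader middlewares enabled in settings.py.
from scrapy import Spider
from scrapy_splash import SplashRequest


class MySpider(Spider):
    name = "example"

    def start_requests(self):
        my_url = "http://www.example.com"  # placeholder URL from the question
        # Ask Splash to render the page and wait ~5 seconds for the JavaScript
        # that fills the table before returning the HTML to Scrapy.
        yield SplashRequest(my_url, callback=self.mydetail, args={"wait": 5})

    def mydetail(self, response):
        item = {}
        item['Description'] = response.xpath(
            ".//table[@class='list']//text()").extract()
        return item
```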

1 Answer


Take a quick look with Firebug or another tool that captures the responses to the AJAX requests made by the JavaScript code. You can then make a chain of requests to catch those AJAX calls that fire after the page loads. There are several related questions: parse ajax content, retrieve final page, parse dynamic content.
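
To illustrate the "chain of requests" idea, here is a rough sketch that assumes the browser's network panel shows the table being filled from a JSON endpoint; the /ajax/list path and the "description" field are hypothetical placeholders, not something taken from the actual site:

```python
import json

import scrapy


class ExampleSpider(scrapy.Spider):
    name = "example"
    start_urls = ["http://www.example.com"]  # placeholder from the question

    def parse(self, response):
        # Instead of scraping the HTML shell, request the endpoint that the
        # page's JavaScript calls (hypothetical path found via Firebug).
        yield scrapy.Request(
            "http://www.example.com/ajax/list",
            callback=self.parse_list,
        )

    def parse_list(self, response):
        # The AJAX endpoint is assumed to return JSON directly.
        data = json.loads(response.text)
        yield {"Description": data.get("description")}
```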

yavalvas