
I'm trying to build a web crawler, so the first step is to analyze the web page. I use urllib2.urlopen("url") to get the page, but the page takes a while to load because of all its JS and so on, so every time I only get part of the page. This is blocking me. Could anyone give me some advice?
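For context, a minimal sketch of what `urlopen` actually hands back (a hedged Python 3 example — `urllib2.urlopen` became `urllib.request.urlopen` there, and the `data:` URL is a stand-in for a real, JS-heavy page):

```python
# urlopen returns the HTML exactly as the server sent it; embedded <script>
# code is never executed, so JS-generated content will not be in the response.
# The data: URL below is a stand-in for a real page built by JavaScript.
import urllib.request

page = "data:text/html,<script>document.write('loaded')</script>"
html = urllib.request.urlopen(page).read().decode("utf-8")

print("document.write" in html)   # True: the script *source* comes back verbatim
```

The script tag arrives as text but is never run, which is why a JS-rendered page looks "incomplete" when fetched this way.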

lambda_lin
    Use some sort of headless browser, or a browser driver, because urllib does not execute JS for you – dm03514 Mar 10 '14 at 13:52
    I wonder if scrapy could solve this. – lambda_lin Mar 10 '14 at 14:01
  • http://stackoverflow.com/questions/8550114/can-scrapy-be-used-to-scrape-dynamic-content-from-websites-that-are-using-ajax, http://stackoverflow.com/questions/10647741/executing-javascript-functions-using-scrapy-in-python, https://github.com/scrapinghub/scrapyjs, http://jackliusr.blogspot.com/2013/11/scrapy-to-crawl-dynamic-contents.html – dm03514 Mar 10 '14 at 14:04
  • I answered a similar question a while back, [have a look](https://stackoverflow.com/questions/22028775/tried-python-beautifulsoup-and-phantom-js-still-cant-scrape-websites/22030553#22030553) at it. – Steinar Lima Mar 10 '14 at 14:30

1 Answer


You can try PyExecJS if you want to execute JS code from Python, but executing client-side code is usually too costly for a simple crawler.