  1. I have been playing around with Windmill to try out some web scraping, but the API call waits.forPageLoad does not seem able to check whether the page is fully rendered.

  2. In a scenario where I need to reload a page whose DOM already exists, I use waits.forElement to detect an element so the script can "decide" that the page has loaded. This occasionally detects the element even before the page has loaded.

  3. Also, loading a page with the Windmill test client in Firefox seems to take forever. The same page takes about 2 seconds in my regular Firefox browser but can take up to a minute in the test client. Is it normal for it to take so long?

  4. Lastly, are there better alternatives to Windmill for web scraping? The documentation seems a bit sparse.

Please advise. Thanks :P

Renl
  • How do you define "the page has loaded" for pages with AJAX requests? – jfs Jan 21 '12 at 07:02
  • On the 3rd point: clear the cache in your regular Firefox browser and try to load the page. How long does it take? – jfs Jan 21 '12 at 07:04
  • [Selenium Webdriver](http://seleniumhq.org/docs/03_webdriver.html) could be used as an alternative, but it takes a ['Use The Source Luke!'](http://selenium.googlecode.com/svn/trunk/docs/api/py/index.html#use-the-source-luke) approach to documentation. – jfs Jan 21 '12 at 07:07
  • I cleared my cache and it still does not load as slowly as the test client. hrmm. – Renl Jan 21 '12 at 07:18
  • For me, the page has loaded once the DOM is loaded and things like comboboxes have been populated. I'm not sure what "pages with AJAX requests" means. – Renl Jan 21 '12 at 07:23
  • E.g., JavaScript on your page can perform an asynchronous HTTP request and change the DOM using the received data. – jfs Jan 21 '12 at 07:36

1 Answer

 client.waits.sleep(milliseconds=u'2000')

This is an absolute pause of 2 seconds.

 client.waits.forPageLoad(timeout=u'20000')

This holds off running the following lines until the page loads or until 20 seconds have elapsed, whichever comes first. Think of it as a time-bounded assert: if the page loads in under 20 seconds it passes; if not, it fails.
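To make the "time-bounded assert" idea concrete, here is a minimal sketch in plain Python. It is not Windmill code: `wait_until` and `FakePage` are hypothetical names made up for this illustration, and `FakePage` only simulates a page that becomes ready after a few checks. In a real script the predicate would have to ask the browser whether the page is in the state you consider "loaded" (for example, a combobox being populated).

```python
import time


def wait_until(predicate, timeout=20.0, interval=0.1):
    """Poll `predicate` until it returns a truthy value or `timeout` seconds pass.

    Hypothetical helper, not part of Windmill. Returns True if the condition
    was met in time and False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakePage:
    """Stand-in for a browser page that becomes ready after N checks (assumption)."""

    def __init__(self, ready_after_polls):
        self.polls = 0
        self.ready_after_polls = ready_after_polls

    def is_ready(self):
        self.polls += 1
        return self.polls >= self.ready_after_polls


if __name__ == "__main__":
    page = FakePage(ready_after_polls=3)
    loaded = wait_until(page.is_ready, timeout=1.0, interval=0.01)
    print("loaded in time" if loaded else "timed out")
```

Unlike a fixed sleep, this returns as soon as the condition holds, and the timeout only caps the worst case.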

I hope this helps,

TD

TangibleDream