I use PhantomJS as my webdriver. Sometimes it takes far too long to load a webpage, and I don't know why:

import time
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

dcap = dict(DesiredCapabilities.PHANTOMJS)
dcap["phantomjs.page.settings.userAgent"] = 'Mozilla/5.0 (Windows NT 10.0;  WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36'
driver = webdriver.PhantomJS(service_args=['--load-images=no'], desired_capabilities=dcap)
t = time.time()
driver.get('http://www.tibetculture.net/2012zyzy/zx/201509/t20150915_3939844.html')
print('Time consuming:', time.time() - t)

It took about 86 seconds to load the page. In a regular browser the same page loads in a few seconds, so I have no idea why PhantomJS takes so long. What's wrong with it?

SimmerChan

1 Answer

There is a "pending" script that keeps running, so the page load never completes. What I would do is set a page load timeout and handle the TimeoutException by calling window.stop():

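import time
from selenium.common.exceptions import TimeoutException

# `driver` is the PhantomJS instance created in the question's code.
t = time.time()
driver.set_page_load_timeout(10)  # give up on the page load after 10 seconds

try:
    driver.get('http://www.tibetculture.net/2012zyzy/zx/201509/t20150915_3939844.html')
except TimeoutException:
    # Stop the requests that are still pending.
    driver.execute_script("window.stop();")
print('Time consuming:', time.time() - t)

# The page content loaded so far can still be queried.
print(driver.find_element_by_id("NewsTitle").text)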

This prints the elapsed time and the news title, which shows that you can still locate elements and interact with the page after the timeout:

Time consuming: 10.590633869171143
让藏医药走出雪域高原
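
If the same pattern is needed for several pages, it could be wrapped in a small helper. The following is only a sketch: `get_or_stop` and its return value are hypothetical names, not part of Selenium, and the `ImportError` fallback is there only so the snippet can be run without Selenium installed.

```python
import time

try:
    from selenium.common.exceptions import TimeoutException
except ImportError:
    # Assumption: stand-in class so the sketch runs without Selenium installed.
    class TimeoutException(Exception):
        pass


def get_or_stop(driver, url, timeout=10):
    """Load `url`; if loading exceeds `timeout` seconds, stop pending requests.

    Hypothetical helper. Returns a tuple (timed_out, elapsed_seconds).
    """
    driver.set_page_load_timeout(timeout)
    start = time.time()
    timed_out = False
    try:
        driver.get(url)
    except TimeoutException:
        # Same step as in the answer: abort whatever is still loading.
        driver.execute_script("window.stop();")
        timed_out = True
    return timed_out, time.time() - start
```

With the `driver` from the question, a call would look like `timed_out, elapsed = get_or_stop(driver, url)`. Returning the `timed_out` flag lets the caller decide whether the partially loaded page is good enough or the load should be retried.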
alecxe
  • It really works, and I can locate all the elements from my program. So was the time wasted on downloading the "pending" script? If there is a timeout, we stop downloading and rendering and just keep the part of the page that has already been rendered, right? I'm not sure I got the point. Thanks a lot! – SimmerChan Mar 30 '16 at 07:03
  • @SimmerChan yeah, that's the idea behind this solution - use `window.stop()` to stop pending requests. – alecxe Mar 30 '16 at 13:29
  • How do I refresh the page after a timeout exception? – Hamid Zandi Mar 24 '18 at 14:47