I have a set of web scrapers that run on Python 3.6 using Selenium with ChromeDriver. All of them ran perfectly.
This week I updated Selenium to v2.8 and ChromeDriver to v2.34.
Immediately, the scrapers stopped working normally and began crashing early in the crawl.
I have a small sys.stdout redirection that writes output both to a .txt file and to the console, so I started noticing errors like these:
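(For context, the stdout shim is nothing fancy; a minimal sketch of the idea, where the class and file names are illustrative rather than my exact code:

import sys

class Tee:
    """Duplicate everything written to stdout into a log file."""
    def __init__(self, path):
        self.terminal = sys.stdout
        self.log = open(path, 'a', encoding='utf-8')

    def write(self, message):
        self.terminal.write(message)  # still echo to the console
        self.log.write(message)       # and persist to the .txt file

    def flush(self):
        self.terminal.flush()
        self.log.flush()

sys.stdout = Tee('spider_log.txt')

The errors it captures look like this:)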
Message: no such frame
(Session info: chrome=63.0.3239.108)
(Driver info: chromedriver=2.34.522940
(1a76f96f66e3ca7b8e57d503b4dd3bccfba87af1),platform=Windows NT 10.0.15063 x86_64)
or
Message: no such element: Unable to locate element:
{"method":"name","selector":"txtClave"}
(Session info: chrome=63.0.3239.108)
(Driver info: chromedriver=2.34.522940
(1a76f96f66e3ca7b8e57d503b4dd3bccfba87af1),platform=Windows NT 10.0.15063 x86_64)
Message: no such element: Unable to locate element:
{"method":"xpath","selector":"//*[@id="ctl00_cp_wz_ddlTarjetas"]/option[2]"}
(Session info: chrome=63.0.3239.108)
(Driver info: chromedriver=2.34.522940
(1a76f96f66e3ca7b8e57d503b4dd3bccfba87af1),platform=Windows NT 10.0.15063 x86_64)
Message: no such element: Unable to locate element:
{"method":"xpath","selector":"//*[@id="ctl00_cp_wz_ddlTarjetas"]/option[3]"}
(Session info: chrome=63.0.3239.108)
(Driver info: chromedriver=2.34.522940
(1a76f96f66e3ca7b8e57d503b4dd3bccfba87af1),platform=Windows NT 10.0.15063 x86_64)
Message: no such element: Unable to locate element:
{"method":"xpath","selector":"//*[@id="ctl00_cp_wz_ddlTarjetas"]/option[4]"}
(Session info: chrome=63.0.3239.108)
(Driver info: chromedriver=2.34.522940
(1a76f96f66e3ca7b8e57d503b4dd3bccfba87af1),platform=Windows NT 10.0.15063 x86_64)
These errors are often followed by a Windows crash dialog for ChromeDriver that never appeared before: "chromedriver.exe has stopped working".
Watching the Chrome window and stepping through with the debugger, I suspect the failures happen at lines that should make the spider wait for a page to load; the driver doesn't wait, so the element lookups fail.
Examples of the lines raising the errors:
# Log in: type the user and password, submitting each with ENTER
self.driver.find_element_by_name('txtUsuario').send_keys(user + Keys.RETURN)
self.driver.find_element_by_name('txtClave').send_keys(passwd + Keys.RETURN)
...
# Back to the top-level document, then into the menu frame
self.driver.switch_to.default_content()
self.driver.switch_to_frame('Fmenu')
self.driver.find_element_by_xpath(XPATH_POSICIONGLOBAL).click()
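Wrapping each of these calls in an explicit wait would presumably cure it; a sketch of what that rewrite looks like (the 10-second timeout is an arbitrary choice of mine, the locators are the same ones from above):

from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(self.driver, 10)

# Block until the login fields are actually in the DOM before typing
wait.until(EC.presence_of_element_located((By.NAME, 'txtUsuario'))).send_keys(user + Keys.RETURN)
wait.until(EC.presence_of_element_located((By.NAME, 'txtClave'))).send_keys(passwd + Keys.RETURN)

self.driver.switch_to.default_content()
# Waits for the frame to exist and switches into it in one step
wait.until(EC.frame_to_be_available_and_switch_to_it('Fmenu'))
wait.until(EC.element_to_be_clickable((By.XPATH, XPATH_POSICIONGLOBAL))).click()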
But it feels like surrender to bolt an explicit wait onto every single element I interact with (I have more than a hundred of them).
I hope someone can help me work out what caused a whole set of working spiders to fail under these new versions of ChromeDriver / Selenium, and suggest an elegant, easy-to-implement workaround.
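To be concrete, a single helper applied in one place is roughly the level of elegance I'm hoping for; something like this hypothetical wrapper (the name and default timeout are mine, assuming a WebDriverWait-based approach):

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def find(self, by, value, timeout=10):
    # Poll up to `timeout` seconds for the element instead of failing instantly
    return WebDriverWait(self.driver, timeout).until(
        EC.presence_of_element_located((by, value)))

# Usage: self.find(By.NAME, 'txtClave').send_keys(passwd + Keys.RETURN)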
Along those lines, I already tried adding an implicitly_wait call to the WebDriver session, but it has no effect at all:
def __init__(self):
    # PATH_WEBDRIVER is the path to my local chromedriver.exe
    self.driver = webdriver.Chrome(PATH_WEBDRIVER)
    self.driver.implicitly_wait(10)  # supposedly polls up to 10 s on every find_element
Finally, I used IDLE to run two of the failing spiders one line at a time, and they work! Presumably, stepping through manually gives each page time to load between commands, which points even more strongly at a timing problem. So why does it fail during normal spider execution?
Many, many thanks in advance