Given the following (slightly pseudo) code:
from selenium.common.exceptions import TimeoutException
from selenium.webdriver import Chrome, ChromeOptions
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.support.ui import WebDriverWait

options = ChromeOptions()
driver = Chrome(options=options)
waiter = WebDriverWait(driver, 10)

list_of_urls = [<list_of_urls>]

for url in list_of_urls:
    driver.get(url)

    # Element A is found reliably on every page.
    locator = (By.XPATH, "xpath_element_A")
    element_A_condition = expected_conditions.presence_of_element_located(locator)
    element_A = waiter.until(element_A_condition)

    try:
        # The wait on sub-element A is the one that intermittently times out.
        locator = (By.XPATH, "xpath_sub_element_A")
        sub_element_A_condition = expected_conditions.presence_of_element_located(locator)
        sub_element_A = waiter.until(sub_element_A_condition)
    except TimeoutException as e:
        raise e
I'm finding that about 2-3% of the URLs I try to scrape raise the TimeoutException.
I've tried extending the wait time, and I've even tried refreshing the page multiple times and re-attempting the entire page scrape, all to no avail.
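For reference, the refresh-and-retry attempt looked roughly like this (a sketch reusing the names from the snippet above; the retry count of 3 is illustrative):

for attempt in range(3):
    try:
        sub_element_A = waiter.until(sub_element_A_condition)
        break
    except TimeoutException:
        # reload the page and run the same wait again
        driver.refresh()
else:
    # still missing after all retries
    raise TimeoutException(f"xpath_sub_element_A never appeared on {url}")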
To try and get to the bottom of this, I put a breakpoint on the final line and ran the code in debug mode. When the exception was raised and the breakpoint hit, I ran waiter.until(sub_element_A_condition) again in the debug terminal, and it immediately returned sub_element_A.
I've now repeated this debugging process multiple times and the result is always the same: the TimeoutException is raised and the breakpoint hit, but I'm able to immediately run waiter.until(sub_element_A_condition) and it always returns the element.
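The same check could be done in code rather than at a breakpoint, with something like this (a sketch reusing the names from the snippet above; the short second wait is illustrative):

try:
    sub_element_A = waiter.until(sub_element_A_condition)
except TimeoutException:
    # mirror the debug-terminal step: run the identical wait again immediately
    try:
        sub_element_A = WebDriverWait(driver, 1).until(sub_element_A_condition)
        print(f"{url}: re-running the wait found the element immediately")
    except TimeoutException:
        print(f"{url}: element genuinely never appeared")
        raise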
This is most perplexing. The only thing I've done differently in the runs where the exception was raised is that I switched focus to the browser window (I run non-headless) to manually eyeball whether the element was on the page. Could switching to the window be doing something that causes the element to become visible?
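To test this without touching the window at all, I'm thinking of capturing the page state at the moment of the timeout, something like the following (the screenshot file name is illustrative):

try:
    sub_element_A = waiter.until(sub_element_A_condition)
except TimeoutException:
    # snapshot the page before any manual window switching
    driver.save_screenshot("timeout_debug.png")
    # presence check with no wait: is the node in the DOM at all right now?
    in_dom = driver.find_elements(By.XPATH, "xpath_sub_element_A")
    print(f"{url}: element in DOM at timeout? {bool(in_dom)}")
    raise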