I have ad pages that show phone numbers only after you click an element on the page. The pages come in many different formats, and writing a separate handler for each one would take a very long time. Page example.

I try to find the clickable element, keep references to it and to its parent, click the element, and then read the result through the parent element, but this fails:

>>> phone = driver.find_element_by_xpath('.//a[contains(@class, "link-phone")]')
>>> phone.get_attribute('innerHTML')
'\n                    <span class="glyphicon glyphicon-phone"></span>Показать телефон'
>>> phone_elem = phone.find_element_by_xpath('..')
>>> phone_elem.get_attribute('innerHTML')
'<a class="link-phone nowrap js-get-phone" href="javascript:void(0);">\n                    <span class="glyphicon glyphicon-phone"></span>Показать телефон</a> '
>>> ActionChains(driver).move_to_element(phone).perform()
>>> sleep(0.5)
>>> phone.click()
>>> sleep(1.5)
>>> phone_elem.get_attribute('innerHTML')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python37\lib\site-packages\selenium\webdriver\remote\webelement.py", line 141, in get_attribute
    self, name)
  File "C:\Python37\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 636, in execute_script
    'args': converted_args})['value']
  File "C:\Python37\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Python37\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
  (Session info: chrome=79.0.3945.56)


Most likely the whole page changes after the click, so the references I saved no longer point at anything. What can I do?

undetected Selenium
kshnkvn

2 Answers

To extract the phone numbers, first scroll the element into view with scrollIntoView(), then use WebDriverWait with element_to_be_clickable() before clicking. After the click, locate the numbers with a fresh lookup. You can use the following locator strategies:

  • Code Block:

    driver.get('https://www.work.ua/ru/jobs/3385738/')
    driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//p/b[text()='Условия:']"))))
    WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//a[@class='link-phone nowrap js-get-phone']"))).click()
    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//p/b[text()='Для связи с нами обращайтесь по номеру:']//following::p/b[contains(., '—')]")))])
    driver.quit()
    
  • Console Output:

    ['+380 (93) 908 — 53 — 66 ', '+380 (93) — 103 — 19 — 77 ']
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
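
The strings in the console output above still contain spaces, brackets and dash separators. If a uniform form is needed, a small post-processing step is one option. This is only a sketch: `normalize_phone` is a hypothetical helper, not part of Selenium, and it assumes each string holds exactly one number with an optional leading `+`.

```python
import re


def normalize_phone(raw):
    """Reduce a scraped phone string to an optional '+' followed by digits.

    Hypothetical helper for illustration; assumes one number per string.
    """
    text = raw.strip()
    prefix = "+" if text.startswith("+") else ""
    # Drop everything that is not a digit: spaces, brackets, dashes
    return prefix + re.sub(r"\D", "", text)


if __name__ == "__main__":
    scraped = ['+380 (93) 908 — 53 — 66 ', '+380 (93) — 103 — 19 — 77 ']
    print([normalize_phone(s) for s in scraped])
```

The function only strips formatting; it does not check that the result is a valid number for any particular country.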
    
undetected Selenium

After clicking the phone link, you can use the XPath below to get the innerHTML:

//div[@id="job-description"]//p//b[contains(.,"+")]//ancestor::p

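Use .find_elements* and iterate over the results to extract the text. Because the elements are looked up again after the click, no stale reference is reused: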

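import time
from selenium.webdriver import ActionChains

# `driver` and `phone` are the objects from the question
ActionChains(driver).move_to_element(phone).perform()
time.sleep(1)
phone.click()
time.sleep(1)
# Query the document again after the click instead of reusing old references
elements = driver.find_elements_by_xpath('//div[@id="job-description"]//p//b[contains(.,"+")]//ancestor::p')

for element in elements:
    print(element.get_attribute('innerHTML'))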
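
The fixed `time.sleep(1)` calls above can be too short on a slow page and wasteful on a fast one. A polling loop that repeats the lookup until something matches is one alternative. The sketch below is an illustration under assumptions: `read_inner_html` and its parameters are hypothetical names, and `driver` only needs a `find_elements(by, value)` method, which Selenium's WebDriver provides. In real code, Selenium's own `WebDriverWait` from the first answer does the same job.

```python
import time


def read_inner_html(driver, xpath, timeout=10.0, poll=0.5, retry_on=()):
    """Poll until `xpath` matches, then return the innerHTML of each match.

    Elements are looked up again on every attempt, so no reference
    obtained before the click is ever reused.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # "xpath" is the string value behind Selenium's By.XPATH
            elements = driver.find_elements("xpath", xpath)
            if elements:
                return [el.get_attribute("innerHTML") for el in elements]
        except retry_on:
            # The page changed between lookup and read; try again
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError(f"nothing matched {xpath!r} within {timeout}s")
        time.sleep(poll)
```

With a real driver this would be called right after `phone.click()`, passing the XPath from this answer and `retry_on=(StaleElementReferenceException,)` imported from `selenium.common.exceptions`.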
frianH