
I'm trying to grab content that loads dynamically after page load, using Selenium with Python 3. I tried every solution I could find here, but none of them work.

Specifically what I need is the value of href, but for now just being able to retrieve the entire page source with all the content after page-load would work as well.

Example of href value I need:

<a class="class1 class2" href="/path1/path2/path3/lsdkfughjfsldkfghsdlf">

I tried the following:

WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".class1.class2")))

This fails with the following error:

WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".Image-image")))
    File "C:\Users\Mike\PycharmProjects\pythonProject\venv\lib\site-packages\selenium\webdriver\support\wait.py", line 89, in until
    raise TimeoutException(message, screen, stacktrace)

If I remove that and try this:

page_source = driver.page_source
driver.close()
# An explicit encoding avoids UnicodeEncodeError on Windows default code pages
with open("output.txt", "w", encoding="utf-8") as f:
    f.write(page_source)

Then I only get the initially loaded HTML page, without the dynamic content.

Additional configurations I am using that may be helpful in finding a solution:

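from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

s = Service("chromedriver.exe")
options = Options()
options.add_argument('user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36')
driver = webdriver.Chrome(options=options, service=s)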

Any direction would be greatly appreciated!

undetected Selenium
Awsmike
  • If you use `WebDriverWait(driver, 20)`, it's highly likely that you'll get a `TimeoutException`. Can you try `print(driver.find_element(By.CSS_SELECTOR, "a.Image-image").get_attribute('href'))` and let us know the exact error? – cruisepandey Apr 05 '22 at 18:11
  • selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"a.Image-image"} – Awsmike Apr 05 '22 at 19:31

2 Answers


As per the HTML:

<a class="class1 class2" href="/path1/path2/path3/lsdkfughjfsldkfghsdlf">

Your locator strategy looks technically correct:

WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".class1.class2")))

However, the values of the class attribute alone may not identify the element uniquely within the HTML DOM. In such cases you may need to construct a more specific locator by adding the tag name as well as the href attribute, or a static part of its value, as follows:

print(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a.class1.class2[href]"))).get_attribute("href"))

Or, more specifically, matching a static part of the href value:

print(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a.class1.class2[href*='path']"))).get_attribute("href"))
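
If the anchor is added to the DOM first and its href is filled in later, a custom wait condition may help. The following is only a minimal sketch under that assumption: `href_of` is a hypothetical helper name, not part of Selenium. It relies on `WebDriverWait.until` accepting any callable that takes the driver and polling it until the return value is truthy. The string "css selector" is the value of `By.CSS_SELECTOR`, written out so the helper itself needs no imports.

```python
def href_of(selector):
    """Build a wait condition for WebDriverWait.until().

    The condition returns the first non-empty href among the elements
    matching `selector`, or False so that the wait keeps polling.
    """
    def condition(driver):
        # find_elements returns an empty list instead of raising
        for element in driver.find_elements("css selector", selector):
            href = element.get_attribute("href")
            if href:
                return href
        return False
    return condition


# Usage with a live driver (assumes selenium is installed):
#   from selenium.webdriver.support.ui import WebDriverWait
#   href = WebDriverWait(driver, 20).until(href_of("a.class1.class2"))
#   print(href)
```

If this still times out, the element is most likely not in the document the driver is currently looking at, rather than merely slow to render.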
undetected Selenium

To address the following error:

selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"a.Image-image"}

Please check in Chrome DevTools whether the locator matches a unique element in the HTML DOM.

XPath that you should check:

//a[@class='Image-image']

Steps to check:

Press F12 in Chrome -> go to the Elements tab -> press Ctrl + F -> paste the XPath and check whether your desired element is highlighted as the 1 of 1 matching node.
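
The same check can be approximated from Python, which also shows what the driver sees rather than what DevTools sees. This is a rough sketch: `describe_matches` is a hypothetical helper name, and the string "xpath" is the value of `By.XPATH`.

```python
def describe_matches(driver, xpath):
    """Report how many elements `xpath` matches in the driver's current context."""
    # find_elements returns a (possibly empty) list and never raises
    # NoSuchElementException, so it is safe to use for counting.
    count = len(driver.find_elements("xpath", xpath))
    if count == 0:
        return "0 matches: not in this document (yet), or in another frame/window"
    if count == 1:
        return "1 match: the locator is unique"
    return f"{count} matches: the locator is ambiguous"


# Usage with a live driver:
#   print(describe_matches(driver, "//a[@class='Image-image']"))
```

If DevTools shows 1 of 1 but this reports 0 matches, the driver is probably looking at a different document than the one you inspected.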

If //a[@class='Image-image'] is unique, then check the following conditions as well.

  1. Check if it's in any iframe/frame/frameset.

    Solution: switch to iframe/frame/frameset first and then interact with this web element.

  2. Check if it's in any shadow-root.

    Solution: Use driver.execute_script("return document.querySelector(...).shadowRoot.querySelector(...)") to get the web element back, and then operate on it accordingly.

  3. Make sure that the element is rendered before interacting with it. Add a hardcoded delay or, preferably, an explicit wait and try again.

    Solution: time.sleep(5) or

    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[@class='Image-image']"))).get_attribute("href")

  4. If you were redirected to a new tab or window and have not switched to it, you will likely get a NoSuchElementException.

    Solution: switch to the relevant window/tab first.

  5. If you have switched to an iframe and the new desired element is not in the same iframe context then first switch to default content and then interact with it.

    Solution: switch to default content and then switch to respective iframe.
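
Conditions 1, 4 and 5 above can be probed in one pass. The sketch below is an illustration, not a definitive implementation: `find_in_frames` and `find_in_windows` are hypothetical helper names, only one level of iframe nesting is searched, and shadow roots are not handled. It uses `driver.window_handles`, `driver.switch_to.window`, `driver.switch_to.frame` and `driver.switch_to.default_content` from the WebDriver API; the string "css selector" is the value of `By.CSS_SELECTOR`.

```python
CSS = "css selector"  # the string value of selenium's By.CSS_SELECTOR


def find_in_frames(driver, selector):
    """Search the top document, then each top-level iframe/frame.

    Returns the first matching element and leaves the driver switched into
    the frame that contains it. Returns None, with the driver back on the
    top document, if nothing matches.
    """
    driver.switch_to.default_content()
    matches = driver.find_elements(CSS, selector)
    if matches:
        return matches[0]
    for frame in driver.find_elements(CSS, "iframe, frame"):
        driver.switch_to.frame(frame)
        matches = driver.find_elements(CSS, selector)
        if matches:
            return matches[0]
        # Nothing here: go back up before trying the next frame
        driver.switch_to.default_content()
    return None


def find_in_windows(driver, selector):
    """Run find_in_frames in every open window/tab, stopping at the first hit."""
    for handle in driver.window_handles:
        driver.switch_to.window(handle)
        element = find_in_frames(driver, selector)
        if element is not None:
            return element
    return None


# Usage with a live driver:
#   link = find_in_windows(driver, "a.class1.class2")
#   if link is not None:
#       print(link.get_attribute("href"))
```

The driver is deliberately left inside the window and frame where the element was found, because the element reference can only be used from that context.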

cruisepandey
  • Not sure if this is unique or not: //*[@id="main"]/div/div/div[3]/div/div/div/div[3]/div[3]/div[2]/div/div/div[9]/div/article/a – Awsmike Apr 06 '22 at 14:24
  • I have mentioned the steps to check if it's unique or not. Please read `Steps to check:` section above – cruisepandey Apr 06 '22 at 14:36
  • Yes - I wasn't sure if the slashes after the first " ] " meant it was or wasn't unique, but I checked all 5 conditions and still get the same error (with the exception of using WebDriverWait, which times out) – Awsmike Apr 06 '22 at 15:00
  • Not sure if I got you correctly or not. When you looked for the `//a[@class='Image-image']` XPath, how many entries did you see in the HTML DOM? – cruisepandey Apr 06 '22 at 17:12
  • 'Image-image' was just an example I used to simplify my question, but this is the XPath: //*[@id="main"]/div/div/div[3]/div/div/div/div[3]/div[3]/div[2]/div/div/div[9]/div/article/a. This last "a" element is the one I need – Awsmike Apr 06 '22 at 17:17
  • This is a horrible XPath: `//*[@id="main"]/div/div/div[3]/div/div/div/div[3]/div[3]/div[2]/div/div/div[9]/div/article/a`. Also, if `Image-image` was just an example, why is it present in the error stack trace `WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".Image-image"))) File "C:\Users\Mike\PycharmProjects\pythonProject\venv\lib\site-packages\selenium\webdriver\support\wait.py", line 89, in until raise TimeoutException(message, screen, stacktrace)`? – cruisepandey Apr 06 '22 at 17:27
  • It was present initially because I simply copied and pasted it to test it out without changing the values. I replaced the values later when I realized what happened, but the horrible xpath is the xpath for the actual dynamic content. – Awsmike Apr 06 '22 at 19:57
  • Just following up... would you happen to know the correct code with the horrible xpath? – Awsmike Apr 09 '22 at 18:13