I would like to scrape an HTML page where content is not static but loaded with javascript.

I downgraded Selenium to version 3.3.0 to keep PhantomJS support (v4.9.x no longer supports PhantomJS) and wrote this code:

from selenium import webdriver
driver = webdriver.PhantomJS('path-to-phantomJS')
driver.get('my_url')
p_element = driver.find_element_by_id(id_='my-id')
print(p_element)

The error I'm getting is:

selenium.common.exceptions.NoSuchElementException: Message: "errorMessage":"Unable to find element with id 'my-id'"

The element I want to return is the <section> tag with a certain id, along with all its subtags. The HTML content looks like this:

<section id="my-id" class="my-class">...</section>
undetected Selenium
Alez
  • You need to wait for the JS to load the element. See [Waits](https://www.selenium.dev/documentation/webdriver/waits/) – Barmar Jun 13 '23 at 16:20
  • After the page loads, what steps do you take manually to see the element with the id you are looking for? – Jortega Jun 13 '23 at 20:08

2 Answers

This error message...

selenium.common.exceptions.NoSuchElementException: Message: "errorMessage":"Unable to find element with id 'my-id'"

...implies that the element wasn't found within the HTML DOM.

One possible reason is that the desired WebElement didn't render within the viewport, as PhantomJS by default initializes with a minimized viewport.


Solution

Maximize the PhantomJS window, then locate the element through a WebDriverWait on visibility_of_element_located(), as follows:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

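driver = webdriver.PhantomJS('path-to-phantomJS')
driver.get('my_url')
# Enlarge the window so the element can render within the viewport
driver.maximize_window()
# Wait up to 20 seconds for the element to become visible
p_element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.ID, "my-id")))
print(p_element)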
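
Since the question asks for the <section> together with all its subtags, note that printing a WebElement shows the object's representation rather than its markup. Below is a minimal sketch of reading the markup instead, assuming the driver exposes the DOM property outerHTML through get_attribute(). The helper name element_markup and the FakeElement stand-in (with its sample markup) are hypothetical, included only so the snippet runs without a browser:

```python
def element_markup(element):
    """Return the element's own tag plus everything nested inside it."""
    # get_attribute() is the WebElement method for reading attributes/properties;
    # "outerHTML" is the DOM property holding the element's serialized markup.
    return element.get_attribute("outerHTML")


class FakeElement:
    """Hypothetical stand-in for a WebElement, for illustration only."""

    def __init__(self, markup):
        self._markup = markup

    def get_attribute(self, name):
        # Only the one property used in this sketch is emulated.
        return self._markup if name == "outerHTML" else None


if __name__ == "__main__":
    demo = FakeElement('<section id="my-id" class="my-class"><p>hi</p></section>')
    print(element_markup(demo))
```

With a real driver, the equivalent call would be print(element_markup(p_element)) once the wait has returned the element.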
undetected Selenium

This could be due to various reasons, such as the element not being present when the code executes or the element having a different ID. If you have double-checked that the ID is correct, make sure the page has finished loading before attempting to find the element. JS-based content may take a bit longer to load, so you can add an explicit wait to ensure that the element is available before accessing it:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

driver = webdriver.PhantomJS('path-to-phantomJS')
driver.get('my_url')
delay = 10  # Wait up to 10 seconds for the element to be present

try:
    wait = WebDriverWait(driver, delay)
    p_element = wait.until(EC.presence_of_element_located((By.ID, 'my-id')))
    print(p_element.text)
except TimeoutException:
    print("Timeout!")
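
To show why an explicit wait helps here, this is a rough sketch of the idea behind it: retry a lookup at short intervals until it succeeds or a deadline passes. It is not Selenium's actual implementation. The names wait_until, WaitTimeout and make_late_element are hypothetical, and the fake lookup stands in for an element that JavaScript adds to the page late:

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy within the timeout."""


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


def make_late_element(ready_after_calls):
    """Hypothetical stand-in for a lookup that only succeeds once JS has run."""
    calls = {"count": 0}

    def lookup():
        calls["count"] += 1
        # Return None (element missing) until enough attempts have been made.
        return "<section>" if calls["count"] >= ready_after_calls else None

    return lookup


if __name__ == "__main__":
    # The first two lookups fail, the third one succeeds.
    print(wait_until(make_late_element(3), timeout=2.0, poll_interval=0.01))
```

A single find_element call corresponds to just the first lookup in this loop, which is why it can fail on a page whose content is still being loaded.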

Hope this helps!

Aymen Krifa