How to get the text using selenium in Python with geckodriver

Question

I have the following html code:

<div class="jaHlC">
<div class="C" data-ft="true">
<div class="IuRIu">
<span>
<span class="biGQs _P fiohW uuBRH">
90 places sorted by traveler favorites</span>
</span>
<span class="nzZVd PJ">

I need to extract the text "90 places sorted by traveler favorites".

My Python code is below; none of the attempts extracts the text:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = "https://www.tripadvisor.com/Attraction_Products-g28922-t21629-zfg21594-Alabama.html"

driver = webdriver.Firefox()
driver.get(url)

WebDriverWait(driver, 15).until(EC.element_to_be_clickable((By.ID, 'onetrust-accept-btn-handler'))).click()

# attempt 1 : does not work
#number = driver.find_element(By.XPATH, '//span[@class="biGQs _P fiohW uuBRH"]')

# attempt 2: does not work
#number = driver.find_element(By.XPATH, "/html/body/div[1]/main/div[1]/div/div[3]/div/div[2]/div[2]/div[2]/div/div/div[2]/div/div[2]/div/div/section[2]/div/div/div/span[1]/span")

# attempt 3: does not work either
number = driver.find_element(By.CSS_SELECTOR, "span.uuBRH")

Please suggest how I can extract the text. Thank you in advance.

score 1 · Answer 1 · answered Jun 13 '23 at 17:31

They use multiple classes for a single element and those classes can change dynamically, so searching for elements with a specific class name may fail if the class name changes.

If we want to find the element using only part of the class name, we could do something like the code below. It will wait up to 10 seconds for the element to become present. If it doesn't appear within 10 seconds, a TimeoutException will be raised.

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(driver, 10)
number = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "span[class*='fiohW uuBRH']")))
print(number.text)
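
One caveat about the selector above: `[class*='fiohW uuBRH']` is a plain substring test on the raw class attribute, so it only matches while both names stay adjacent and in that order. The sketch below mimics that behaviour in pure Python (no browser needed). The helper names and the reordered class string are hypothetical, made up for illustration.

```python
def attr_contains(class_attr: str, needle: str) -> bool:
    """Mimic CSS [class*='needle']: substring test on the raw attribute value."""
    return needle in class_attr


def attr_has_token(class_attr: str, token: str) -> bool:
    """Mimic CSS [class~='token']: match one whole whitespace-separated class name."""
    return token in class_attr.split()


original = "biGQs _P fiohW uuBRH"    # class attribute from the question
reordered = "biGQs uuBRH _P fiohW"   # hypothetical: same classes, different order

# Combined substring, as in span[class*='fiohW uuBRH']
combined_original = attr_contains(original, "fiohW uuBRH")
combined_reordered = attr_contains(reordered, "fiohW uuBRH")

# Chained substrings, as in span[class*='fiohW'][class*='uuBRH']
chained_reordered = attr_contains(reordered, "fiohW") and attr_contains(reordered, "uuBRH")

# Whole-token match, as in span[class~='uuBRH']
token_reordered = attr_has_token(reordered, "uuBRH")

print(combined_original, combined_reordered, chained_reordered, token_reordered)
```

Chaining the attribute selectors removes the dependence on order, but either form still breaks if the generated class names themselves change.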

score 1 · Accepted Answer · answered Jun 13 '23 at 19:30

The class attribute values like biGQs, fiohW, uuBRH, etc. are dynamically generated and are bound to change sooner or later. They may change the next time you access the application afresh, or even at the next application startup, so they can't be used in locators.

Solution

To extract the text 90 places sorted by traveler favorites, ideally you should use WebDriverWait with visibility_of_element_located(). You can use either of the following locator strategies:

Using CSS_SELECTOR and the text attribute:

driver.get("https://www.tripadvisor.com/Attraction_Products-g28922-t21629-zfg21594-Alabama.html")
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "section[data-automation=WebPresentation_WebSortDisclaimer] div > span > span"))).text)

Using XPATH and get_attribute("innerHTML"):

driver.get("https://www.tripadvisor.com/Attraction_Products-g28922-t21629-zfg21594-Alabama.html")
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//section[@data-automation='WebPresentation_WebSortDisclaimer']//div/span/span"))).get_attribute("innerHTML"))

Console output:
```
90 places sorted by traveler favorites
```

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
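
Putting the pieces together, here is a minimal sketch that wraps the CSS_SELECTOR variant in a function and returns None on timeout instead of raising. The names `fetch_disclaimer_text`, `parse_place_count` and `main` are hypothetical, and pulling the leading number out of the text is an assumption about what the asker ultimately wants (the question stores the element in a variable called `number`). It also assumes Firefox and geckodriver are set up as in the question.

```python
import re
from typing import Optional

URL = "https://www.tripadvisor.com/Attraction_Products-g28922-t21629-zfg21594-Alabama.html"
LOCATOR_CSS = "section[data-automation=WebPresentation_WebSortDisclaimer] div > span > span"


def parse_place_count(text: str) -> Optional[int]:
    """Return the leading integer of e.g. '90 places sorted by ...', or None."""
    match = re.match(r"\s*(\d[\d,]*)", text)
    return int(match.group(1).replace(",", "")) if match else None


def fetch_disclaimer_text(driver, timeout: float = 20) -> Optional[str]:
    """Wait for the sort-disclaimer span to be visible and return its text."""
    # Imported here so parse_place_count stays usable without Selenium installed.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    try:
        element = WebDriverWait(driver, timeout).until(
            EC.visibility_of_element_located((By.CSS_SELECTOR, LOCATOR_CSS))
        )
    except TimeoutException:
        return None  # element never became visible within the timeout
    return element.text.strip()


def main() -> None:
    """Live run: opens Firefox, loads the page, prints the text and the count."""
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get(URL)
        text = fetch_disclaimer_text(driver)
        print(text, parse_place_count(text) if text else None)
    finally:
        driver.quit()  # always close the browser, even on errors
```

Calling `main()` performs the live run; `parse_place_count` can be checked on its own without a browser.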

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python

References

Links to useful documentation:

get_attribute() method: gets the given attribute or property of the element.
text attribute: returns the text of the element.
Difference between text and innerHTML using Selenium
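
As a rough illustration of that difference: `get_attribute("innerHTML")` returns the raw markup inside the element, so it can include nested tags and the line breaks from the page source, while `text` returns the rendered text. The sketch below approximates the cleanup with the standard library only. The sample string and the helper names are hypothetical, and the result is not guaranteed to equal what Selenium's `text` returns, since `text` also takes element visibility into account.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect only the character data, dropping all tags."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def inner_html_to_text(inner_html: str) -> str:
    """Strip tags from an innerHTML string and collapse runs of whitespace."""
    collector = TextCollector()
    collector.feed(inner_html)
    collector.close()  # flush any buffered trailing text
    return " ".join("".join(collector.parts).split())


# Hypothetical innerHTML value: leading newline plus a nested tag.
sample_inner_html = "\n90 places <b>sorted</b> by traveler favorites"
plain = inner_html_to_text(sample_inner_html)
print(repr(sample_inner_html))
print(repr(plain))
```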

Thank you so much @undetected for your response. Your solution works perfectly. Thank you.. — R Sandy, Jun 14 '23 at 07:51
