0

I'm trying to get the text inside these divs, but I'm not succeeding. They have a code before the class attribute (data-v-a5b90146) that doesn't change between executions.

<div data-v-a5b90146="" class="html-content"> Text to be captured</div>
            

<div data-v-a5b90146="" class="html-content"><b> TEXT WANTED </b><div><br></div>

I also tried XPath, but without success.

content = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, '/html/body/div/div/div/div/div/div/div[1]/div[2]/div[2]/div[4]/div/div/b'))).text
Silvio Duarte
  • What's the error? Also, please add the complete HTML or the URL so that the XPath can be verified. – Manish Jan 30 '23 at 22:00
  • Are you sure your XPath is correct? Also, using such full XPath selectors is not recommended. Can you give more details of your HTML, please? – Mahsum Akbas Jan 31 '23 at 09:12

3 Answers

1

You need to change a couple of things.

  1. presence_of_all_elements_located() returns a list of elements, so you can't call .text on the result directly. To get the text, iterate over the list and read .text from each element.

  2. The absolute XPath is very fragile. Use a relative XPath instead; since the class name is unique, you can locate the elements by class name.
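To see why calling .text on the wait result fails, here is a minimal sketch that runs without a browser. FakeElement is a hypothetical stand-in for Selenium's WebElement (only the .text attribute is modeled, and the strings are dummy data); the point is only that a Python list has no .text attribute, while each item in it does.

```python
from dataclasses import dataclass


@dataclass
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement; only .text is modeled."""
    text: str


# presence_of_all_elements_located() yields a list of elements, like this one.
elements = [FakeElement("Text to be captured"), FakeElement("TEXT WANTED")]

try:
    elements.text  # same mistake as calling .text on the list itself
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Iterating gives access to each element's own .text.
texts = [element.text for element in elements]

print(error_message)
print(texts)
```

With a real driver the list would come from WebDriverWait(...).until(...), but the list-versus-element distinction is the same.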

Your code should look like this:

contents = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, '//div[@class="html-content"][@data-v-a5b90146]')))
for content in contents:
    print(content.text)

You can use visibility_of_all_elements_located() as well:

contents = WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, '//div[@class="html-content"][@data-v-a5b90146]')))
for content in contents:
    print(content.text)
KunduK
  • Thanks. It worked. However, it returns a list of multiple strings. The text I want is on line 7, but when I write list[6].text I get an error. How do I access a specific list element? – Silvio Duarte Jan 31 '23 at 00:22
  • Could you print your list to show what it looks like? Without seeing the list, I can't say what the issue is. – KunduK Jan 31 '23 at 00:27
  • Through a for loop I was able to add the items to another normal list. The list that Selenium returns is a specific type that does not respond to traditional Python commands, as far as I could see. – Silvio Duarte Feb 01 '23 at 17:40
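On the indexing question in these comments: as far as I know, the all-elements wait conditions return an ordinary Python list of WebElement objects, so normal zero-based indexing should apply. The sketch below uses a hypothetical FakeElement stand-in so it runs without a browser, and text_at is an illustrative helper, not a Selenium function. Since no traceback was posted, the cause of the reported error is unknown; two possibilities (both assumptions) are indexing the built-in name list rather than the variable that holds the result, or the result holding fewer than seven items.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement; only .text is modeled."""
    text: str


def text_at(elements: Sequence[FakeElement], index: int) -> Optional[str]:
    """Return the text of elements[index], or None if the index is out of range."""
    if -len(elements) <= index < len(elements):
        return elements[index].text
    return None


# Seven dummy items standing in for the elements returned by the wait.
contents = [FakeElement(f"line {n}") for n in range(1, 8)]

print(contents[6].text)       # zero-based: index 6 is the seventh item
print(text_at(contents, 6))   # same lookup through the bounds-checked helper
print(text_at(contents, 10))  # out of range, so the helper returns None
```

Checking the length first (or catching IndexError) avoids a crash when the page yields fewer matches than expected.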
1

Both <div> tags have the attribute class="html-content".


Solution

To extract the text from the <div> tags, induce WebDriverWait for visibility_of_all_elements_located() instead of presence_of_all_elements_located(), and collect the text with a list comprehension. You can use either of the following locator strategies:

  • Using CSS_SELECTOR:

    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.html-content")))])
    
  • Using XPATH:

    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='html-content']")))])
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
undetected Selenium
-1

Try getting the element first, then get the text from the element:

# presence_of_element_located (singular) returns one element, which has .text
element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((
        By.XPATH, '/html/body/div/div/div/div/div/div/div[1]/div[2]/div[2]/div[4]/div/div/b'
    ))
)
content = element.text
Ricky S