I noticed that when I retrieve the text of all HTML div tags in a loop, the retrieved text is not always identical to the original. Specifically, if my text contains multiple consecutive spaces (or tabs), they are collapsed.

For example, given the text 'Hello_____World', how can I retrieve exactly that instead of 'Hello_World'?

The code approach:

for flower in flowers:
  print(flower.text)

Generally, it is good to retrieve text without redundant spaces, but it is difficult to query the database with slightly different text, and I don't think it is desirable to query the database with only part of the text.

To conclude, is there any way to retrieve the text as it is, without the spaces being trimmed, using Selenium with Python?

George
  • Could you add an example website? – PDHide Jan 15 '21 at 07:34
  • The website under test is still in development and is only available locally for now. I retrieve all the flowers with the command: flowers=driver.find_elements_by_xpath("//td[@class='flowerpicker_listtd']") – George Jan 15 '21 at 07:50
  • Add any website with similar behavior – PDHide Jan 15 '21 at 07:50
  • See https://stackoverflow.com/questions/10711116/strip-spaces-tabs-newlines-python. @rosstripi You can adapt the list for replace if needed. – Gaj Julije Jan 15 '21 at 09:04

1 Answer

To extract all the texts using Selenium, you have to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following Locator Strategies:

  • Using CSS_SELECTOR and get_attribute("innerHTML"):

    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "td.flowerpicker_listtd")))])
    
  • Using XPATH and text attribute:

    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//td[@class='flowerpicker_listtd']")))])
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
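
Since `.text` returns the rendered text, in which runs of whitespace are collapsed, reading the DOM `textContent` property may be closer to what the question asks for. The following is only a minimal sketch: `raw_texts` is a hypothetical helper name, and it assumes the elements were already located with one of the strategies above.

```python
def raw_texts(elements):
    """Return the DOM textContent of each element, whitespace included.

    `elements` is assumed to be a list of Selenium WebElement objects
    (anything exposing get_attribute() works). `raw_texts` is a
    hypothetical helper name, not part of Selenium.
    """
    # get_attribute("textContent") reads the DOM property rather than the
    # rendered text, so consecutive spaces and tabs are not collapsed the
    # way they are in the .text value.
    return [element.get_attribute("textContent") for element in elements]
```

For example, `raw_texts(driver.find_elements(By.CSS_SELECTOR, "td.flowerpicker_listtd"))` would return one string per cell. Unlike `innerHTML`, `textContent` contains no markup, but it also includes text from hidden child nodes, so the result can differ from what is visible on the page.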
    
undetected Selenium
  • Thanks for the compact solution, but the problem remains. My problem is that the retrieved text is not the same as the original. When I type driver.find_element_by_xpath("//td[@class='flowerpicker__listtd' and text()='san tosa']").click(), I get that sant osa is not found due to the space trimming – George Jan 15 '21 at 17:02
  • @George This answer was a precise solution for `visibility_of_all_elements_located`, whereas the question in your comment points to a `click()` activity, for which there would be a different approach. It sounds like a completely different issue altogether. Can you raise a new question as per your new requirement? Stack Overflow contributors will be happy to help you out. – undetected Selenium Jan 16 '21 at 00:09
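
For the lookup described in the last comments, one possible approach is to compare whitespace-normalized text on both sides rather than matching `text()` exactly. XPath 1.0 provides `normalize-space()`, which strips leading and trailing whitespace and collapses inner runs into a single space. The sketch below is an illustration under assumptions: `collapse_whitespace` and `cell_xpath` are hypothetical helper names, the class name is taken from the locator quoted earlier in the comments, and whether this resolves the original problem depends on how the page stores its text.

```python
import re

# The characters XPath 1.0 normalize-space() treats as whitespace.
_XPATH_WHITESPACE = re.compile(r"[ \t\r\n]+")


def collapse_whitespace(text):
    """Mimic XPath normalize-space(): trim the ends, collapse inner runs."""
    return _XPATH_WHITESPACE.sub(" ", text).strip(" ")


def cell_xpath(text, css_class="flowerpicker_listtd"):
    """Build an XPath matching a td whose normalized text equals `text`.

    Hypothetical helper; the default class name is an assumption taken
    from the locator used in the comments.
    """
    wanted = collapse_whitespace(text)
    if "'" in wanted:
        # Quoting is kept simple in this sketch: single quotes are rejected.
        raise ValueError("text containing a single quote is not supported")
    return f"//td[@class='{css_class}' and normalize-space()='{wanted}']"
```

The resulting string can be passed to `driver.find_element(By.XPATH, cell_xpath("san   tosa"))`. Because both the page text and the search text are normalized, the number of spaces stored in the database no longer has to match the rendered text exactly.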