2

I'm new to Python and web scraping, so I'm not sure what the value between an element's <div> tags is called. Sorry for not being able to be more specific.

<div class="syllable">value</div>

Is there a way to assign the value between the <div> tags to a string variable in Python, using Selenium and XPath? For example, the "value" in the element above would be stored as a string, and printing it would output:

value

I'm new to Python and Selenium, so I can't figure it out.

undetected Selenium
Kip

3 Answers

3

To print out the text of the element:

elem=driver.find_element_by_class_name("syllable")
print(elem.text)

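Using XPath (select the element itself rather than its text() node, because Selenium can only return elements):

elem=driver.find_element_by_xpath("//div[@class='syllable']")
print(elem.text)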
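The find_element_by_* helpers used in this thread were removed in Selenium 4.3, so on a current install the same lookup goes through find_element() with a By locator. A minimal sketch, assuming `driver` is an already started WebDriver with the page loaded; the function name syllable_text is made up for illustration, and the fallback By class exists only so the snippet can be run without Selenium installed:

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:
    # Fallback so the sketch runs without Selenium installed;
    # these strings mirror the values of Selenium's own By constants.
    class By:
        CLASS_NAME = "class name"
        XPATH = "xpath"


def syllable_text(driver):
    """Return the visible text of the first <div class="syllable">."""
    # `driver` is assumed to be a started WebDriver (hypothetical here).
    element = driver.find_element(By.XPATH, "//div[@class='syllable']")
    return element.text
```

With a real driver this would be called as print(syllable_text(driver)).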
Arundeep Chohan
2

This is called the element's innerText.

In Selenium you can retrieve this value with the text property or with get_attribute().

This returns the rendered text (that is, the text as displayed):

elem=driver.find_element_by_class_name("syllable")
print(elem.text)

This returns the text without checking the style attribute, meaning it returns the value even if it's not displayed in the UI:

elem=driver.find_element_by_class_name("syllable")
print(elem.get_attribute("textContent"))

You can also find the element by its text:

# partial match
elem=driver.find_element_by_xpath("//div[contains(text(),'value')]")
print(elem.text)

# exact match
elem=driver.find_element_by_xpath("//div[text()='value']")
print(elem.text)

# exact match on the element's whole text; it won't match if a child element such as a span contributes additional text
elem=driver.find_element_by_xpath("//div[.='value']")
print(elem.text)
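The three predicates differ in which text they look at. The sketch below is a plain-Python emulation of XPath 1.0 semantics built on the standard library's ElementTree, not something Selenium runs; the helper names and the sample markup are invented for illustration:

```python
import xml.etree.ElementTree as ET


def direct_text_nodes(elem):
    """Text nodes that are direct children of elem (what XPath text() selects)."""
    parts = [elem.text] + [child.tail for child in elem]
    return [p for p in parts if p]


def string_value(elem):
    """All descendant text concatenated (what XPath '.' is compared against)."""
    return "".join(elem.itertext())


def matches_text_equals(elem, wanted):
    # //div[text()='value']: true if ANY direct text node equals the string.
    return any(node == wanted for node in direct_text_nodes(elem))


def matches_contains_text(elem, wanted):
    # //div[contains(text(),'value')]: XPath 1.0 uses only the FIRST text node.
    nodes = direct_text_nodes(elem)
    return bool(nodes) and wanted in nodes[0]


def matches_dot_equals(elem, wanted):
    # //div[.='value']: compares the element's whole string value.
    return string_value(elem) == wanted


# Hypothetical sample markup covering the interesting cases.
page = ET.fromstring(
    "<body>"
    "<div id='a'>value</div>"
    "<div id='b'>my value here</div>"
    "<div id='c'><span>value</span></div>"
    "<div id='d'>val<span>ue</span></div>"
    "</body>"
)
divs = page.findall("div")
text_equals = [d.get("id") for d in divs if matches_text_equals(d, "value")]
contains_text = [d.get("id") for d in divs if matches_contains_text(d, "value")]
dot_equals = [d.get("id") for d in divs if matches_dot_equals(d, "value")]
print(text_equals, contains_text, dot_equals)
```

In this sample only the first div satisfies text()='value', the first two satisfy contains(text(),'value'), and the first, third and fourth satisfy .='value', because the last form compares the combined text of the element and its descendants.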

Also note: outerHTML and innerHTML are related properties worth reading about.
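To make the difference between these properties concrete without starting a browser, here is a rough offline sketch using the standard library's ElementTree. It only approximates what a browser reports, and the sample markup is invented for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: a div whose text is split across a child element.
div = ET.fromstring('<div class="syllable">val<b>ue</b></div>')

# Roughly textContent: all text inside the element, with tags dropped.
text_content = "".join(div.itertext())

# Roughly innerHTML: the markup between the opening and closing tags.
inner_html = (div.text or "") + "".join(
    ET.tostring(child, encoding="unicode") for child in div
)

# Roughly outerHTML: the element's own tags plus everything inside.
outer_html = ET.tostring(div, encoding="unicode")

print(text_content)
print(inner_html)
print(outer_html)
```

For the element in the question, which has no child tags, textContent and innerHTML come out the same, which is why both appear in the answers here.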

PDHide
2

To print the text value you can use any of the following Locator Strategies:

  • Using class_name and get_attribute("textContent"):

    print(driver.find_element_by_class_name("syllable").get_attribute("textContent"))
    
  • Using css_selector and get_attribute("innerHTML"):

    print(driver.find_element_by_css_selector("div.syllable").get_attribute("innerHTML"))
    
  • Using xpath and text attribute:

    print(driver.find_element_by_xpath("//div[@class='syllable']").text)
    

Ideally, you should induce WebDriverWait for visibility_of_element_located(), and you can use any of the following Locator Strategies:

  • Using CLASS_NAME and get_attribute("textContent"):

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CLASS_NAME, "syllable"))).get_attribute("textContent"))
    
  • Using CSS_SELECTOR and text attribute:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.syllable"))).text)
    
  • Using XPATH and get_attribute():

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='syllable']"))).get_attribute("innerHTML"))
    
  • Console Output:

    value
    
  • Note : You have to add the following imports :

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
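For intuition about what the explicit wait is doing: WebDriverWait(...).until(...) keeps calling the condition (every 0.5 seconds by default) until it returns something truthy or the timeout expires, at which point it raises a TimeoutException. The sketch below is a simplified stand-in written with the standard library only, not Selenium's actual implementation; wait_until and its parameters are made-up names:

```python
import time


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            # Like WebDriverWait.until, hand back whatever the condition returned.
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

The real class additionally passes the driver to the condition and ignores NoSuchElementException while polling, so an element that has not been rendered yet simply counts as "not ready".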
    

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python



undetected Selenium