I want to extract the text of all the <li> elements under a <ul>. This is what I tried:

elem = driver.find_elements_by_xpath(("//div[@class='left width50']/p/b/ul"))
len(elem)

This gives 0, i.e. an empty list.

Here is the HTML source:

<div class="left width50">
    <p><b>Features:</b></p>
    <ul>
        <li>Easy spray application</li>
        <li>Excellent bonding properties</li>
        <li>Single package</li>
        <li>Mixed with clean potable water at job site</li>
    </ul>
</div>

Here is the link to the website

Any suggestions on how to go about this?

undetected Selenium
Andre_k
  • Remove `/p/b`; it is not needed. Or use this CSS selector: `#borderForGrid > div.left.width50 > ul` – Kaushik Jun 03 '19 at 05:42
  • @Kaushik how do I use a `css selector`? – Andre_k Jun 03 '19 at 05:45
  • `driver.find_element_by_css_selector('#borderForGrid > div.left.width50 > ul')` Read the [link](https://selenium-python.readthedocs.io/locating-elements.html#locating-elements-by-css-selectors) also – Kaushik Jun 03 '19 at 05:47
  • I used Xpath `a=driver.find_element_by_xpath('//*[@id="borderForGrid"]/div[1]/ul')` , but it has ' ' elements – Andre_k Jun 03 '19 at 05:56
  • You need to iterate it to get the `li` values. – Kaushik Jun 03 '19 at 06:14
  • For your information, the page contains duplicate `id`s, which is not expected in a valid web page. – Kaushik Jun 03 '19 at 06:21

3 Answers

Your XPath looks for a ul nested inside the p and b tags. That would only match HTML that looks something like this:

<div class="left width50">
    <p><b>Features:<ul>
            <li>Easy spray application</li>
            <li>Excellent bonding properties</li>
            <li>Single package</li>
            <li>Mixed with clean potable water at job site</li>
    </ul></b></p>

</div>

But your actual HTML is different: the ul is a sibling of the p tag, not a descendant of it.

So you should build the path without the p and b tags.
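The difference can be checked offline. The sketch below is only an illustration: it uses Python's standard xml.etree.ElementTree instead of Selenium (an assumption made so it runs without a browser), parses the snippet from the question, and runs both paths against it. ElementTree supports only a subset of XPath, but that subset is enough for these two expressions.

```python
import xml.etree.ElementTree as ET

# The HTML fragment from the question, wrapped in a <body> so the
# parser has a single root element.
HTML = """
<div class="left width50">
    <p><b>Features:</b></p>
    <ul>
        <li>Easy spray application</li>
        <li>Excellent bonding properties</li>
        <li>Single package</li>
        <li>Mixed with clean potable water at job site</li>
    </ul>
</div>
"""

root = ET.fromstring(f"<body>{HTML}</body>")

# Path from the question: expects the <ul> inside <p><b>...</b></p>.
wrong = root.findall(".//div[@class='left width50']/p/b/ul")

# Path without /p/b: the <ul> is a direct child of the <div>.
right = root.findall(".//div[@class='left width50']/ul")

# Collect the text of every <li> under the matched <ul>.
items = [li.text for ul in right for li in ul.findall("li")]

print(len(wrong), len(right))
print(items)
```

The first path matches nothing because the b element closes before the ul starts, while the second path finds the ul and its four li children.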

Chrome can give you a quick starting point. Open the developer tools with the F12 key, go to the Elements tab, right-click the element you want to find, and choose Copy > Copy selector (or Copy XPath).

You can read more about the ways to locate an element here

If you want to use XPath, this is the right expression for you: //*[@id="borderForGrid"]/div[1]/ul

Extraction Process

Once you have the ul element, this will get the text of every li inside it:

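# all_ul_from_xpath is the <ul> WebElement found with the XPath above
# (needs: from selenium.webdriver.common.by import By)
all_li = all_ul_from_xpath.find_elements(By.TAG_NAME, "li")
for li in all_li:
    text = li.text
    print(text)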

Working code for reference.

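from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("http://www.carboline.com/products/")

# Locate the <ul> of the first product block.
# find_element(By...) works in Selenium 3 and 4; the find_element_by_* helpers were removed in Selenium 4.
elem = driver.find_element(By.XPATH, '//*[@id="borderForGrid"]/div[1]/ul')

# Print the text of every <li> under that <ul>.
all_li = elem.find_elements(By.TAG_NAME, "li")
for li in all_li:
    print(li.text)

driver.quit()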

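To cover every product instead of only div[1], one option is to match all the blocks by class and read each block's li children in turn. The sketch below is an assumption-laden illustration, not code run against the real site: the markup is a hypothetical, simplified two-block fragment (the second block deliberately has no ul), and it uses the standard xml.etree.ElementTree so it runs without a browser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: two product blocks, the second one
# without any <ul>. The real page structure may differ.
PAGE = """
<div id="borderForGrid">
    <div class="left width50">
        <p><b>Features:</b></p>
        <ul>
            <li>Easy spray application</li>
            <li>Single package</li>
        </ul>
    </div>
    <div class="left width50">
        <p><b>Features:</b></p>
    </div>
</div>
"""

root = ET.fromstring(PAGE.strip())

features_per_block = []
for block in root.findall(".//div[@class='left width50']"):
    # findall returns an empty list when the block has no <ul>/<li>,
    # so a block without features yields [] instead of raising.
    features_per_block.append([li.text for li in block.findall("./ul/li")])

print(features_per_block)
```

With Selenium the same shape would presumably be driver.find_elements(By.CSS_SELECTOR, "div.left.width50") followed by block.find_elements(By.TAG_NAME, "li") for each block. The plural find_elements also returns an empty list rather than raising when nothing matches.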

Kaushik
  • this gives correct answer, but how to iterate this for entire product list, should I change `//*[@id="borderForGrid"]/div[2]/ul` for every item? – Andre_k Jun 03 '19 at 06:29

Presumably you want to extract the text of all the <li> elements associated with the <h5> tag whose text is A/D TC-55 SEALER. To achieve that, you can use either of the following Locator Strategies:

  • Using CSS_SELECTOR:

    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "li[data-brands='Southwest'][data-types='Acrylics'] div.left.width50 ul>li")))])
    
  • Using XPATH:

    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//h5//a[text()='A/D TC-55 SEALER']//following::div[1]//ul//li")))])
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
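The XPath above hardcodes one product name. A minimal sketch of how it could be parameterized for other headings follows. The helper name li_xpath_for is hypothetical (it is not part of Selenium), and it assumes every product follows the same h5 > a heading structure as A/D TC-55 SEALER.

```python
def li_xpath_for(product_name: str) -> str:
    """Build the XPath from this answer for an arbitrary product heading.

    Hypothetical helper for illustration; not part of Selenium.
    """
    # XPath 1.0 has no escape character inside string literals, so pick
    # a quote style that the product name does not contain.
    if "'" not in product_name:
        literal = f"'{product_name}'"
    elif '"' not in product_name:
        literal = f'"{product_name}"'
    else:
        raise ValueError("product name contains both quote characters")
    return f"//h5//a[text()={literal}]//following::div[1]//ul//li"


print(li_xpath_for("A/D TC-55 SEALER"))
print(li_xpath_for("CARBOCRYLIC 3356-1"))
```

The returned string can be passed as the second item of the (By.XPATH, ...) tuple in the WebDriverWait call shown above.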
    
undetected Selenium
  • Considering your solution `print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//h5//a[text()='A/D TC-55 SEALER']//following::div[1]//ul//li")))])` is limited if the text is `A/D TC-55 SEALER` what if I want it for other text like `CARBOCRYLIC 3356-1` – Andre_k Jun 04 '19 at 06:39
  • @deepesh Observe the HTML you have provided, which exclusively points towards **A/D TC-55 SEALER** section. Hence was my answer. Glad that you got an accepted solution. – undetected Selenium Jun 04 '19 at 06:55
  • Thank you @DebanjanB !, I solved my problem using hints from **Kaushik** answer, but it seems it doesn't work when those list are empty. so I came to using your solution, which specifically uses the text name. – Andre_k Jun 04 '19 at 06:58

There is no element matching the XPath:

//div[@class='left width50']/p/b/ul

The class left width50 matches 500 web elements, and so does //div[@class='left width50']/p/b. None of those b elements contains a ul, though.

That's why len() gives you 0.

Instead, try this XPath:

//a[text()='A/D Firefilm III']/../following-sibling::div[1]/descendant::li
cruisepandey