
I'm trying to get the text inside an <a> tag in a nested ul-li structure. I can locate all the <li> elements, but I can't get the text inside the <a> elements.

I'm using Python 3.7 and Selenium WebDriver with the Firefox driver.

The corresponding HTML is:

[some HTML]

<ul class="dropdown-menu inner">
<!---->
    <li nya-bs-option="curso in ctrl.cursos group by curso.grupo" class="nya-bs-option first-in-group group-item">
        <span class="dropdown-header">Cursos em Destaque</span>
        <a tabindex="0">Important TEXT 1</a>
    </li>
    <!-- end nyaBsOption: curso in ctrl.cursos group by curso.grupo -->
    <li nya-bs-option="curso in ctrl.cursos group by curso.grupo" class="nya-bs-option group-item">
        <span class="dropdown-header">Cursos em Destaque</span>
        <a tabindex="0">Important TEXT 2</a>
    </li>
    <!-- end nyaBsOption: curso in ctrl.cursos group by curso.grupo -->
    <li nya-bs-option="curso in ctrl.cursos group by curso.grupo" class="nya-bs-option group-item">
        <span class="dropdown-header">Cursos em Destaque</span>
        <a tabindex="0">Important TEXT 3</a>
    </li>
    <!-- end nyaBsOption: curso in ctrl.cursos group by curso.grupo -->
    <li nya-bs-option="curso in ctrl.cursos group by curso.grupo" class="nya-bs-option group-item">
        <span class="dropdown-header">Cursos em Destaque</span>
        <a tabindex="0">Important TEXT4</a>
    </li>
    [another 100 similar <li></li> blocks]
    <li class="no-search-result" placeholder="Curso">
        <span>Unimportant TEXT</span>
    </li>
</ul>

[more HTML]

I've tried the code below:

cursos = browser.find_elements_by_xpath('//li[@nya-bs-option="curso in ctrl.cursos group by curso.grupo"]')
nome_curso = [curso.find_element_by_tag_name('a').text for curso in cursos]

I get a list with the correct number of items, but every item is an empty string (''). Can anyone help me? Thanks.

undetected Selenium

1 Answer


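It seems you were close. Your locator finds the elements, but the text attribute returns only the text Selenium considers visible, which would explain the empty strings if the links are not displayed yet when you read them. To extract the texts (Important TEXT 1, Important TEXT 2, Important TEXT 3, Important TEXT4, and so on), induce WebDriverWait for visibility_of_all_elements_located() and use either of the following locator strategies. The snippets call the WebDriver instance driver; in your code it is browser.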

  • Using CSS_SELECTOR and the get_attribute() method:

    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "ul.dropdown-menu.inner li.nya-bs-option a")))])
    
  • Using XPATH and the text attribute:

    print([my_elem.text for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//ul[@class='dropdown-menu inner']//li[contains(@class, 'nya-bs-option')]//a")))])
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

You can find a relevant discussion in "How to retrieve the title attribute through Selenium using Python?"
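
If the visibility wait times out because the dropdown is still closed (an assumption; the question does not say whether the menu is open), a possible fallback is to wait only for presence in the DOM and read textContent, which is returned for hidden elements too. The sketch below reuses the CSS selector from the first bullet; the function names texts_from_elements and wait_for_option_texts are hypothetical, not part of Selenium.

```python
def texts_from_elements(elements):
    """Return the stripped textContent of each element, skipping empty ones."""
    texts = []
    for elem in elements:
        # textContent is read from the DOM, so it is returned even when the
        # element is hidden; .text only returns text that is rendered.
        raw = elem.get_attribute("textContent") or ""
        if raw.strip():
            texts.append(raw.strip())
    return texts


def wait_for_option_texts(driver, timeout=5):
    """Wait until the <a> elements exist in the DOM, then read their text."""
    # Imported here so texts_from_elements() can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Same selector as in the answer; presence (not visibility) is awaited.
    locator = (By.CSS_SELECTOR, "ul.dropdown-menu.inner li.nya-bs-option a")
    elements = WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located(locator)
    )
    return texts_from_elements(elements)
```

With the question's variable names this would be called as nome_curso = wait_for_option_texts(browser). Waiting for presence instead of visibility is a trade-off: it also returns text the user cannot currently see, which is fine for scraping but not for checking what is displayed.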


Outro

As per the documentation:
