1

I am trying to extract the text of all the span elements into a list, using the HTML below, which I got through Selenium WebDriver. The desired output is:

['1a', '1b', '1c', '2a', ' ', ' ', '3a', '3b', '3c', '4a', ' ', ' ']

Does anyone know how to do it?

HTML:

<tr style="background-color:#999">
    <td><b style="white-space: nowrap;">table_num</b></td>
        <td style="text-align:center;">
            <span style="flex: 1;display: flex;flex-direction: column;">
                <span>1a</span>
                <span>1b</span>
                <span>1c</span>
                </span>
        </td>
        <td style="text-align:center;">
            <span style="flex: 1;display: flex;flex-direction: column;">
                <span>2a</span>
                <span>     </span>
                <span>     </span>
           </span>
        </td>
        <td style="text-align:center;">
            <span style="flex: 1;display: flex;flex-direction: column;">
                <span>3a</span>
                <span>3b</span>
                <span>3c</span>
            </span>
        </td>
        <td style="text-align:center;">
            <span style="flex: 1;display: flex;flex-direction: column;">
                <span>4a</span>
                <span>     </span>
                <span>     </span>
            </span>
        </td>
</tr>
undetected Selenium
acube
  • Please share the code which you have written till now, we will try to resolve the problem which you are facing. – Swaroop Humane May 06 '22 at 15:16
  • Below is the code I have written. So far I have only managed to get the result for 1a, 1b and 1c. Can anyone help? print(WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//td[contains(.,'table_num')]/following-sibling::td[1]"))).text) – acube May 06 '22 at 15:44

3 Answers

0

Use the XPath below, which matches all the required spans.

//span[contains(@style,"column")]/span

Once you have all the spans, extract the text from each one.

If a span's text is empty, ignore it; otherwise, add it to the list.
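
The steps above can be sketched in Python roughly as follows. This is a minimal sketch rather than a tested solution for your page: the helper names `span_texts` and `texts_from_driver` and the `keep_blank` flag are made up for illustration, and `driver` is assumed to be a Selenium WebDriver that has already loaded the page. Selenium's `.text` normally trims surrounding whitespace, so a whitespace-only span comes back as an empty string; the sketch maps it to `' '` when blanks are kept.

```python
SPAN_XPATH = '//span[contains(@style,"column")]/span'  # XPath from this answer


def span_texts(elements, keep_blank=True):
    """Collect the text of each element, optionally keeping blank entries."""
    texts = []
    for element in elements:
        text = (element.text or "").strip()
        if text:
            texts.append(text)
        elif keep_blank:
            texts.append(" ")  # placeholder for an empty span
    return texts


def texts_from_driver(driver, keep_blank=True):
    """Find the inner spans with the XPath above and return their texts."""
    from selenium.webdriver.common.by import By  # needs the selenium package

    return span_texts(driver.find_elements(By.XPATH, SPAN_XPATH), keep_blank)
```

Calling `texts_from_driver(driver)` keeps the empty entries; pass `keep_blank=False` to skip them instead.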

Swaroop Humane
  • Thanks! But do you know how I can code it so that it extracts all the span texts (including empty text) starting from 'table_num' instead? – acube May 06 '22 at 17:03
0

Based on the HTML, to extract the text of all the <span> elements into a list, induce WebDriverWait for visibility_of_all_elements_located() and, using a list comprehension, apply either of the following locator strategies:

  • Using CSS_SELECTOR and text attribute:

    driver.get("application url")
    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "tr[style^='background'] > td td > span span")))])
    
  • Using XPATH and get_attribute("innerHTML"):

    driver.get("application url")     
    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//tr[starts-with(@style, 'background')]/td//td/span//span")))])
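
If the wait above times out, one thing worth checking is the wait condition. `visibility_of_all_elements_located()` only succeeds when every matched element is displayed, so whitespace-only spans that render with zero size may keep it from ever returning. The sketch below waits for presence instead and reads `textContent`. It is an untested assumption about this page, not a confirmed fix: the helper names are hypothetical, and the selector assumes the `<td>` cells are siblings directly under the `<tr>`, as in the question's HTML.

```python
# Assumed selector: the <td> cells are direct children of the highlighted row.
SELECTOR = "tr[style^='background'] > td > span > span"


def blank_to_space(raw):
    """Trim raw text; return a single space when nothing is left."""
    return (raw or "").strip() or " "


def wait_for_span_texts(driver, timeout=20):
    """Wait until the spans exist in the DOM, then return their texts."""
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    spans = WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located((By.CSS_SELECTOR, SELECTOR))
    )
    return [blank_to_space(span.get_attribute("textContent")) for span in spans]
```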
    
undetected Selenium
  • Thanks. But how can I extract all the span texts (including empty ones) starting from the 'table_num' text instead? – acube May 07 '22 at 00:04
  • What does the code in the answer print using CSS and XPath? – undetected Selenium May 07 '22 at 00:29
  • Hi, I just realized I am using "html_page = driver.page_source" to get the HTML code. I edited your code a bit but still see errors: css: ---- html_page = driver.page_source print([my_elem.text for my_elem in WebDriverWait(html_page, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "tr[style^='background'] > td td > span span")))]) ERROR: AttributeError: 'str' object has no attribute 'find_elements' – acube May 07 '22 at 00:59
  • xpath: ------ html_page = driver.page_source print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(html_page, 30).until(EC.visibility_of_all_elements_located((By.XPATH, "//tr[starts-with(@style, 'background')]/td//td/span//span")))]) ERROR: AttributeError: 'str' object has no attribute 'find_elements' – acube May 07 '22 at 01:00
  • Do you know the problem? – acube May 07 '22 at 01:04
  • You don't have to `html_page = driver.page_source`, simply access the page. I've updated the answer accordingly. – undetected Selenium May 07 '22 at 02:09
  • Now I run the CSS code and got this error. You know the problem? ERROR-1: Exception in Tkinter callback Traceback (most recent call last): File "C:\Users\Al PC\AppData\Local\Programs\Python\Python310\lib\tkinter\__init__.py", line 1921, in __call__ return self.func(*args) File "C:\Users\Al PC\PycharmProjects\Fi\fi.py", line 524, in py_and_pt_click – acube May 07 '22 at 03:39
  • ERROR-1-1: print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "tr[style^='background'] > td td > span span")))]) File "C:\Users\Al PC\PycharmProjects\SocialMedia\venv\lib\site-packages\selenium\webdriver\support\wait.py", line 89, in until raise TimeoutException(message, screen, stacktrace) selenium.common.exceptions.TimeoutException: Message: – acube May 07 '22 at 03:39
  • I tried a 60-second delay and got the same error. Do you know the problem? – acube May 07 '22 at 03:52
0

Just remove the predicate [1] from XPath, so it becomes:

//td[contains(.,'table_num')]/following-sibling::td

And to be more precise, you could use:

//td[contains(.,'table_num')]/following-sibling::td/span/span
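
Put together with Python, the second XPath could be used roughly like this. It is a sketch under assumptions: `xpath_after_label` and `texts_after_label` are illustrative names, `driver` is an existing Selenium WebDriver with the page loaded, and the label contains no single quote.

```python
def xpath_after_label(label):
    """Build the XPath for the inner spans of the cells that follow the label cell."""
    return f"//td[contains(.,'{label}')]/following-sibling::td/span/span"


def texts_after_label(driver, label="table_num"):
    """Return the text of every inner span, using ' ' for empty ones."""
    from selenium.webdriver.common.by import By  # needs the selenium package

    spans = driver.find_elements(By.XPATH, xpath_after_label(label))
    return [span.text.strip() or " " for span in spans]
```

Because the XPath starts from the cell containing the label, only the spans in that row's following cells are collected.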
Siebe Jongebloed