How to extract multiple texts from span elements using python selenium?

Question

I am trying to extract the text of every inner span in the HTML below into a list using Selenium WebDriver. The expected result is:

['1a', '1b', '1c', '2a', ' ', ' ', '3a', '3b', '3c', '4a', ' ', ' ']

Does anyone know how to do this?

HTML:

<tr style="background-color:#999">
    <td><b style="white-space: nowrap;">table_num</b></td>
        <td style="text-align:center;">
            <span style="flex: 1;display: flex;flex-direction: column;">
                <span>1a</span>
                <span>1b</span>
                <span>1c</span>
            </span>
        </td>
        <td style="text-align:center;">
            <span style="flex: 1;display: flex;flex-direction: column;">
                <span>2a</span>
                <span>　　　　　</span>
                <span>　　　　　</span>
            </span>
        </td>
        <td style="text-align:center;">
            <span style="flex: 1;display: flex;flex-direction: column;">
                <span>3a</span>
                <span>3b</span>
                <span>3c</span>
            </span>
        </td>
        <td style="text-align:center;">
            <span style="flex: 1;display: flex;flex-direction: column;">
                <span>4a</span>
                <span>　　　　　</span>
                <span>　　　　　</span>
            </span>
        </td>
</tr>

Please share the code you have written so far, and we will try to resolve the problem you are facing. – Swaroop Humane, May 06 '22 at 15:16
Below is the code I have written so far. It only returns the result for 1a, 1b and 1c. Can anyone help? `print(WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//td[contains(.,'table_num')]/following-sibling::td[1]"))).text)` – acube, May 06 '22 at 15:44

Swaroop Humane · Answer 1 · 2022-05-06T16:03:20.403

0

Use the XPath below, which matches all the required inner spans.

//span[contains(@style,"column")]/span

Once you have all the span elements, extract the text from each one.

If the text is empty, skip it; otherwise add it to the list.
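
A minimal sketch of the extraction step described above, assuming the span elements have already been located with Selenium. The helper name `span_texts` and the `keep_blank` flag are illustrative and not part of Selenium. Blank spans (in this page they hold ideographic spaces) are either dropped or reported as a single space.

```python
def span_texts(elements, keep_blank=False):
    """Return the text of each element; blank entries are dropped or kept as ' '."""
    texts = []
    for element in elements:
        # str.strip() removes Unicode whitespace, including U+3000 ideographic spaces.
        text = element.text.strip()
        if text:
            texts.append(text)
        elif keep_blank:
            texts.append(" ")
    return texts


# Usage with a live driver (assumes `driver` has already loaded the page):
# from selenium.webdriver.common.by import By
# spans = driver.find_elements(By.XPATH, '//span[contains(@style,"column")]/span')
# print(span_texts(spans, keep_blank=True))
```

Passing `keep_blank=True` keeps one list entry per span, which is what the expected output in the question appears to need.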

edited May 06 '22 at 16:03

answered May 06 '22 at 15:52

Swaroop Humane


Thanks! But do you know how I can code it so that it extracts all span texts (including empty ones), starting from 'table_num' instead? – acube May 06 '22 at 17:03

undetected Selenium · Answer 2 · 2022-05-07T02:08:38.147

0

As per the HTML, to extract the texts from all the <span> elements into a list, you have to induce WebDriverWait for visibility_of_all_elements_located(). Then, using a list comprehension, you can use either of the following locator strategies:

Using CSS_SELECTOR and text attribute:

driver.get("application url")
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "tr[style^='background'] > td td > span span")))])

Using XPATH and get_attribute("innerHTML"):

driver.get("application url")
print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//tr[starts-with(@style, 'background')]/td//td/span//span")))])

edited May 07 '22 at 02:08

answered May 06 '22 at 23:11

undetected Selenium


Thanks. But how can I extract all span texts (including empty ones), starting from the 'table_num' text instead? – acube May 07 '22 at 00:04
What does the code in the answer print using CSS and XPath? – undetected Selenium May 07 '22 at 00:29
Hi, I realized I was using `html_page = driver.page_source` to get the HTML. I edited your code a bit but still see errors. CSS: `html_page = driver.page_source` `print([my_elem.text for my_elem in WebDriverWait(html_page, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "tr[style^='background'] > td td > span span")))])` ERROR: AttributeError: 'str' object has no attribute 'find_elements' – acube May 07 '22 at 00:59
XPath: `html_page = driver.page_source` `print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(html_page, 30).until(EC.visibility_of_all_elements_located((By.XPATH, "//tr[starts-with(@style, 'background')]/td//td/span//span")))])` ERROR: AttributeError: 'str' object has no attribute 'find_elements' – acube May 07 '22 at 01:00
Do you know what the problem is? – acube May 07 '22 at 01:04
You don't need `html_page = driver.page_source`; simply access the page. I've updated the answer accordingly. – undetected Selenium May 07 '22 at 02:09
Now I ran the CSS code and got this error. Do you know what the problem is? ERROR-1: Exception in Tkinter callback Traceback (most recent call last): File "C:\Users\Al PC\AppData\Local\Programs\Python\Python310\lib\tkinter\__init__.py", line 1921, in __call__ return self.func(*args) File "C:\Users\Al PC\PycharmProjects\Fi\fi.py", line 524, in py_and_pt_click – acube May 07 '22 at 03:39
ERROR-1-1: print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "tr[style^='background'] > td td > span span")))]) File "C:\Users\Al PC\PycharmProjects\SocialMedia\venv\lib\site-packages\selenium\webdriver\support\wait.py", line 89, in until raise TimeoutException(message, screen, stacktrace) selenium.common.exceptions.TimeoutException: Message: – acube May 07 '22 at 03:39
I also tried a 60-second delay and got the same error. Do you know what the problem is? – acube May 07 '22 at 03:52
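
The `'str' object has no attribute 'find_elements'` error in the comments above is consistent with `WebDriverWait` being given the `driver.page_source` string rather than the driver itself. If working from the page source string is preferred, one option is to parse it without Selenium. The following is a rough sketch that uses only the standard library's `html.parser`. The class and function names are made up for illustration, and it collects every span nested directly inside another span, which may be broader than wanted on a full page.

```python
from html.parser import HTMLParser


class InnerSpanTexts(HTMLParser):
    """Collect the text of every <span> nested inside another <span>."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # number of <span> elements currently open
        self.buffer = []  # text pieces of the inner span being read
        self.texts = []   # finished results, one entry per inner span

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.depth += 1
            if self.depth == 2:
                self.buffer = []

    def handle_data(self, data):
        if self.depth >= 2:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self.depth > 0:
            if self.depth == 2:
                # Blank spans (e.g. ideographic spaces) are reported as " ".
                text = "".join(self.buffer).strip()
                self.texts.append(text or " ")
            self.depth -= 1


def inner_span_texts(html):
    """Parse an HTML string and return the inner span texts in document order."""
    parser = InnerSpanTexts()
    parser.feed(html)
    parser.close()
    return parser.texts


# Usage (assumes `driver` has already loaded the page):
# print(inner_span_texts(driver.page_source))
```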

score 0 · Answer 3 · answered May 11 '22 at 11:13

0

Just remove the predicate [1] from the XPath, so that it becomes:

//td[contains(.,'table_num')]/following-sibling::td

And to be more precise, you could use:

//td[contains(.,'table_num')]/following-sibling::td/span/span
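
A sketch of how the second XPath might be combined with an explicit wait, assuming Selenium 4 and a `driver` that has already loaded the page. `row_span_xpath` and `row_span_texts` are hypothetical helper names. `presence_of_all_elements_located` is used here instead of a visibility check as an assumption, in case the blank spans are not reported as visible.

```python
def row_span_xpath(label):
    """Build the XPath for the inner spans in the cells that follow the label cell."""
    # Assumes the label contains no single quote, since it is embedded in '...'.
    return f"//td[contains(.,'{label}')]/following-sibling::td/span/span"


def row_span_texts(driver, label="table_num", timeout=10):
    """Wait for the inner spans of the labelled row and return their texts."""
    # Imported inside the function so that row_span_xpath can be used without Selenium.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    spans = WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located((By.XPATH, row_span_xpath(label)))
    )
    # Blank cells come back as empty or whitespace-only text; report them as " ".
    return [span.text.strip() or " " for span in spans]


# Usage (assumes `driver` is a Selenium WebDriver already on the page):
# print(row_span_texts(driver, "table_num"))
```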

answered May 11 '22 at 11:13

Siebe Jongebloed

