Match all specific rows of a table with Selenium WebDriver - Python

Question

I'm still new to web scraping and I have a question about Selenium WebDriver.

Code example:

<table>
    <tbody>
        <tr>
            <td> car </td>
            <td> bus </td>
        </tr>
       <tr>
            <td> car </td>
            <td> bus & train </td>
        </tr>
       <tr>
            <td> car </td>
            <td> bus & plane </td>
        </tr>
    </tbody>
</table>

<table>
    <tbody>
        <tr>
            <td> food </td>
            <td> meat</td>
        </tr>
       <tr>
            <td> drink </td>
            <td> water </td>
        </tr>
    </tbody>
</table>

In my real page, there are multiple tables that share the same ID and class names.

Question: how can I extract all the <tr> elements that contain the word "bus"?

I can't find the correct XPath syntax to use.

undetected Selenium · Accepted Answer · 2021-01-07T13:38:29.170

To build a list of all the <tr> elements that have a child <td> containing the text bus, you can use the following XPath-based locator strategy:

elements = driver.find_elements(By.XPATH, "//tr[.//td[contains(., 'bus')]]")

Ideally, you should induce WebDriverWait for visibility_of_all_elements_located(), using the same locator strategy:

elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//tr[.//td[contains(., 'bus')]]")))

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
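
As a rough sketch, the XPath and the wait above could be wrapped into a small helper so the search word is not hard-coded. The names xpath_literal, rows_containing_xpath and find_rows are hypothetical (they are not part of Selenium), and the 20-second timeout simply mirrors the snippet above. XPath 1.0 has no escape character inside string literals, so the helper picks the quote style for you.

```python
def xpath_literal(text: str) -> str:
    """Quote text as an XPath 1.0 string literal (XPath has no escape character)."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote types are present: split on ' and glue the pieces with concat().
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join(f"'{p}'" for p in parts) + ")"


def rows_containing_xpath(word: str) -> str:
    """Build the XPath for every <tr> that has a descendant <td> containing word."""
    return f"//tr[.//td[contains(., {xpath_literal(word)})]]"


def find_rows(driver, word: str, timeout: float = 20):
    """Wait until the matching rows are visible, then return them as WebElements.

    Hypothetical helper: driver is assumed to be an already started WebDriver.
    """
    # Imported here so the XPath builders above work without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.XPATH, rows_containing_xpath(word))
    return WebDriverWait(driver, timeout).until(
        EC.visibility_of_all_elements_located(locator)
    )
```

With a running driver, find_rows(driver, "bus") would return the matching rows, and [row.text for row in rows] gives their visible text. Keep in mind that contains() is a substring match, so a cell reading "busy" would match as well.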

score 0 · Answer 2 · answered Jan 07 '21 at 08:09

Use BeautifulSoup:

from bs4 import BeautifulSoup

# A multi-line string literal needs triple quotes.
html = """<table>
    <tbody>
        <tr>
            <td> car </td>
            <td> bus </td>
        </tr>
       <tr>
            <td> car </td>
            <td> bus & train </td>
        </tr>
       <tr>
            <td> car </td>
            <td> bus & plane </td>
        </tr>
    </tbody>
</table>

<table>
    <tbody>
        <tr>
            <td> food </td>
            <td> meat</td>
        </tr>
       <tr>
            <td> drink </td>
            <td> water </td>
        </tr>
    </tbody>
</table>"""

# The "lxml" parser requires the lxml package to be installed.
soup = BeautifulSoup(html, "lxml")

# Collect every <tr>, then keep the rows whose text contains "bus".
rows = soup.find_all("tr")

output = [row for row in rows if "bus" in row.text]
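
If installing extra packages is not an option, the same row filter can be sketched with Python's standard-library html.parser. This is a swapped-in technique, not what the answer above uses. RowCollector and rows_containing are hypothetical names, and sample is a shortened copy of the question's markup with & written as &amp;.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect the cell texts of every <tr>, one list of strings per row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being parsed, or None
        self._cell = None  # text pieces of the cell being parsed, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Join the text pieces and drop the padding around the cell text.
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # guard against a missing </td>


def rows_containing(html, word):
    """Return the rows (as lists of cell texts) where any cell contains word."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    return [row for row in parser.rows if any(word in cell for cell in row)]


sample = """
<table><tbody>
<tr><td> car </td><td> bus </td></tr>
<tr><td> car </td><td> bus &amp; train </td></tr>
</tbody></table>
<table><tbody>
<tr><td> food </td><td> meat</td></tr>
</tbody></table>
"""

matches = rows_containing(sample, "bus")
print(matches)
```

Because the rows are collected regardless of which table they belong to, tables that share the same ID and class names are handled the same way.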

score 0 · Answer 3 · answered Jan 07 '21 at 08:22

//td[contains(text(),'bus')]

You can use contains() with text(); this returns every <td> that has "bus" in its text.

answered Jan 07 '21 at 08:22

PDHide
