Conditional operators in Beautiful Soup findAll by attribute value

Question

I want to find all tds whose custom HTML attribute data-stat does not equal a given value, e.g. data-stat="randomValue".
My data looks something like this:

<td data-stat="foo">10</td>
<td data-stat="bar">20</td>
<td data-stat="test">30</td>
<td data-stat="DUMMY"> </td>

I know that I can just select for foo, bar, and test, but my actual dataset will have hundreds of different values for data-stat, so it just wouldn't be feasible to code.

Is there something like a != operator that I can use in Beautiful Soup? I tried:

[td.getText() for td in rows[i].findAll('td:not([data-stat="DUMMY"])')]

but I only get [] as a value.

Does this answer your question? [Exclude unwanted tag on Beautifulsoup Python](https://stackoverflow.com/questions/40760441/exclude-unwanted-tag-on-beautifulsoup-python) — Michael Ruth, Sep 02 '22 at 22:46
`.findAll()` doesn't accept CSS selector syntax. Use `.select()` – Barmar, Sep 02 '22 at 22:48

score 1 · Accepted Answer · answered Sep 02 '22 at 22:51

You can use a list comprehension to filter out the unwanted tags, for example:

print([td.text for td in soup.find_all("td") if td.get("data-stat") != "DUMMY"])

Or use a CSS selector with .select (as @Barmar said in the comments, .find_all doesn't accept CSS selectors):

print([td.text for td in soup.select('td:not([data-stat="DUMMY"])')])
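For reference, here is a minimal self-contained sketch that runs both approaches against the sample markup from the question. The surrounding table and tr tags and the choice of "html.parser" are assumptions added only so the snippet runs on its own; the :not() selector also assumes a Beautiful Soup version with soupsieve support (4.7 or later).

```python
from bs4 import BeautifulSoup

# Sample cells from the question; the <table>/<tr> wrapper is an assumption.
html = """
<table><tr>
<td data-stat="foo">10</td>
<td data-stat="bar">20</td>
<td data-stat="test">30</td>
<td data-stat="DUMMY"> </td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# Approach 1: collect every td, then filter in Python.
via_find_all = [td.text for td in soup.find_all("td") if td.get("data-stat") != "DUMMY"]

# Approach 2: let the CSS selector do the filtering.
via_select = [td.text for td in soup.select('td:not([data-stat="DUMMY"])')]

print(via_find_all)  # ['10', '20', '30']
print(via_select)    # ['10', '20', '30']
```

Both variants also keep a td that has no data-stat attribute at all, since td.get() returns None in that case and the :not() selector only excludes an exact match.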

Andrej Kesely


thanks! This works but what is the difference between the find all and select solutions? – theloosygoose Sep 02 '22 at 22:53
@theloosygoose `.select` accepts CSS selectors, `.find_all` doesn't. The difference is only the API (for example, you can use `lambda` in `.find_all` etc.) – Andrej Kesely Sep 02 '22 at 22:54
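As an illustration of the `lambda` point, the sketch below passes a function to `.find_all`, which calls it with each tag and keeps the ones for which it returns True. The markup is a shortened, hypothetical variant of the question's data, with one extra td that has no data-stat attribute to show how a missing attribute is treated.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: the last cell has no data-stat attribute at all.
html = (
    '<table><tr>'
    '<td data-stat="foo">10</td>'
    '<td data-stat="DUMMY"> </td>'
    '<td>40</td>'
    '</tr></table>'
)
soup = BeautifulSoup(html, "html.parser")

# The function receives each Tag; returning True keeps it in the result.
cells = soup.find_all(lambda tag: tag.name == "td" and tag.get("data-stat") != "DUMMY")
texts = [td.text for td in cells]

print(texts)  # ['10', '40']
```

The function form is handy when the condition is more than a simple comparison, for example excluding several values with `tag.get("data-stat") not in {...}`.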
