
Answering my own question: after writing it up, I found the answer in the docs. I had missed it at first, so I am recording it here on Stack Overflow for reference.

Question: If BeautifulSoup's find_all() cannot find a particular class, why does it not return None?

html = """
    <div><p class="apple">apple</p></div>
"""

from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
s = '<p class="banana">banana</p>'
p = soup.find_all('p', attrs={'class':'banana'})
print(type(p))
## <class 'bs4.element.ResultSet'>
if p is None:
    print("p is None, as expected")
else:
    s = soup.p.extract()
    print("p is not None... but why?") 
print(s)

## p is not None... but why?
## <p class="apple">apple</p>
PatrickT
  • The problem was not a subtlety about the class attributes, but simply that `find_all()` returns an empty list rather than `None` (see answer below). In retrospect I could have made a much simpler example, e.g. with an empty string `html = ''` and `soup.find_all('p')`. – PatrickT May 01 '20 at 18:22

1 Answer


From the docs:

If find_all() can’t find anything, it returns an empty list. If find() can’t find anything, it returns None.

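Conclusion: find() and find_all() behave differently in this respect! The empty result of find_all() is a `ResultSet`, which is a subclass of `list`, so it is never `None`.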
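To make the difference concrete, here is a minimal sketch that runs both methods against the same markup as in the question. The variable names `missing_one` and `missing_all` are only illustrative.

```python
from bs4 import BeautifulSoup

html = '<div><p class="apple">apple</p></div>'
soup = BeautifulSoup(html, "html.parser")

# Neither call can match: there is no <p class="banana"> in the markup.
missing_one = soup.find("p", attrs={"class": "banana"})
missing_all = soup.find_all("p", attrs={"class": "banana"})

print(missing_one is None)   # True: find() returns None when nothing matches
print(missing_all is None)   # False: find_all() still returns a ResultSet
print(len(missing_all))      # 0: the ResultSet is simply empty
```

So a check against `None` is the right test for find(), but it can never succeed for find_all().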


In the conditional above,

if p is None:

may be replaced by:

if not p:

because an empty list evaluates as false in Python.
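Applied to the example from the question, the truthiness check might look like the sketch below. The `matches` and `message` names are assumptions made for illustration, not part of the original code.

```python
from bs4 import BeautifulSoup

html = '<div><p class="apple">apple</p></div>'
soup = BeautifulSoup(html, "html.parser")

matches = soup.find_all("p", attrs={"class": "banana"})

# An empty ResultSet is falsy, so "not matches" is True when nothing was found.
if not matches:
    message = "no <p class='banana'> found"
else:
    message = f"found {len(matches)} match(es)"

print(message)
```

With this check the "nothing found" branch is taken, and the unrelated `<p class="apple">` element is no longer extracted by mistake.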

PatrickT