
I would like to parse an HTML file with Python, and the module I am using is BeautifulSoup.

It is said that the function find_all is the same as findAll. I've tried both of them, but I believe they are different:

import urllib, urllib2, cookielib
from BeautifulSoup import *
site = "http://share.dmhy.org/topics/list?keyword=TARI+TARI+team_id%3A407"

rqstr = urllib2.Request(site)
rq = urllib2.urlopen(rqstr)
fchData = rq.read()

soup = BeautifulSoup(fchData)

t = soup.findAll('tr')

Can anyone tell me the difference?

daaawx
Oberon
  • Which version of BeautifulSoup are you using? If you're supposed to use BS4, then the import should be `from bs4 import BeautifulSoup`. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/#porting-code-to-bs4 – marchelbling Sep 09 '12 at 13:28
  • What is the difference? I mean, you said you used both and you saw a difference. Could you post some output that shows the different behaviour? Or are you asking why there are two methods that do the same thing? In that case Martijn Pieters is correct. – Bakuriu Sep 09 '12 at 19:40
  • With find_all, it couldn't find the module; with findAll, it found several parts of the HTML code. – Oberon Sep 09 '12 at 22:46

2 Answers


In BeautifulSoup version 4, the methods are exactly the same; the mixed-case versions (findAll, findAllNext, nextSibling, etc.) have all been renamed to conform to the Python style guide, but the old names are still available to make porting easier. See Method Names for a full list.

In new code, you should use the lowercase versions, such as find_all.

In your example, however, you are using BeautifulSoup version 3 (discontinued since March 2012; don't use it if you can help it), where only findAll() is available. Unknown attribute names (such as .find_all, which is only available in BeautifulSoup 4) are treated as if you are searching for a tag by that name. There is no <find_all> tag in your document, so None is returned for that, and trying to call that None as a method then fails.
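The fallback described above can be illustrated with a small stand-in. This is only a sketch: `ToyTag` is a hypothetical class written for this example, not BeautifulSoup's actual implementation, and it assumes nothing beyond the behaviour stated above (unknown attribute names are looked up as tag names).

```python
class ToyTag:
    """Hypothetical stand-in for a BS3-style tag; not the real BeautifulSoup class."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def find(self, name):
        # Return the first direct child with a matching tag name, or None.
        for child in self.children:
            if child.name == name:
                return child
        return None

    def findAll(self, name):
        # Return every direct child with a matching tag name.
        return [child for child in self.children if child.name == name]

    def __getattr__(self, attr):
        # Python calls this only when normal attribute lookup fails.
        # Unknown names are treated as a search for a tag of that name.
        if attr.startswith("__"):
            raise AttributeError(attr)
        return self.find(attr)


doc = ToyTag("table", [ToyTag("tr"), ToyTag("tr")])

rows = doc.findAll("tr")   # a real method: returns both rows
missing = doc.find_all     # not a method here: searched as a <find_all> tag, gives None

try:
    doc.find_all("tr")     # this is really None("tr")
    call_error = None
except TypeError as exc:
    call_error = exc       # 'NoneType' object is not callable
```

Under these assumptions, findAll works, while find_all quietly resolves to None and only fails once it is called, which would be consistent with the two names appearing to behave differently.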

Martijn Pieters

From the source code of BeautifulSoup:

http://bazaar.launchpad.net/~leonardr/beautifulsoup/bs4/view/head:/bs4/element.py#L1260

def find_all(self, name=None, attrs={}, recursive=True, text=None,
                 limit=None, **kwargs):
# ...
# ...

findAll = find_all       # BS3
findChildren = find_all  # BS2
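To see what that kind of assignment does, here is a minimal sketch of the same aliasing pattern. `ToySoup` and its simplified `find_all` signature are hypothetical names made up for illustration; only the `findAll = find_all` idea comes from the excerpt above.

```python
class ToySoup:
    """Hypothetical class used only to show the aliasing pattern."""

    def __init__(self, tags):
        self.tags = list(tags)

    def find_all(self, name=None, limit=None):
        # Collect matching tag names, stopping at `limit` if one is given.
        matches = [t for t in self.tags if name is None or t == name]
        return matches if limit is None else matches[:limit]

    # Old spellings bound to the very same function object.
    findAll = find_all
    findChildren = find_all


soup = ToySoup(["tr", "td", "tr"])

same_function = ToySoup.findAll is ToySoup.find_all
new_result = soup.find_all("tr")
old_result = soup.findAll("tr")
```

Because the old names are plain class attributes pointing at one function, there is no wrapper and no behavioural difference between the spellings in this sketch.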
kmonsoor