I'm trying to get the text contents of this kind of HTML snippet using Beautiful Soup (the object is a "Tag" object).

<span class="font5"> arrives at this calculation from the Torah’s report that the deluge (rains) began on the 17<sup>th</sup> day of the second month </span>

I've tried:

soup.contents.find_all('span')
soup.find_all('span')
soup.find_all(re.compile("font[0-9]+"))
soup.string
soup.child

And none of these seem to be working. What can I do?

Ester Lin

3 Answers

soup.find_all('span') does work; it returns all span tags.

If you want only the span tags whose class matches font<N>, pass the pattern as the class_ keyword argument:

soup.find_all('span', class_=re.compile('font[0-9]+'))
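
As a minimal sketch of the call above, assuming the snippet from the question is parsed with the built-in "html.parser" (the variable names html and spans are only illustrative):

```python
import re
from bs4 import BeautifulSoup

# The snippet from the question, parsed with the standard-library parser.
html = (
    '<span class="font5"> arrives at this calculation from the Torah’s report '
    'that the deluge (rains) began on the 17<sup>th</sup> day of the second month </span>'
)
soup = BeautifulSoup(html, "html.parser")

# class_ takes a compiled pattern; a tag matches if any of its classes matches it.
spans = soup.find_all("span", class_=re.compile(r"font[0-9]+"))

for span in spans:
    # Each result is a Tag; its class attribute is a list of class names.
    print(span.name, span["class"])
```

Passing the pattern as the first positional argument, as in the question, filters on the tag name instead of the class, which is why that attempt finds nothing.
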
falsetru

If a class starting with font is unique enough, you can also use a CSS selector that looks for a class attribute beginning with font:

soup.select("span[class^=font]")
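
A small sketch of how that selector behaves, using made-up markup (the three spans and their class names are assumptions for illustration). One caveat worth checking in your own data: ^= tests the whole class attribute string, so a span whose class list merely contains a font class later on is not matched.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one span whose class starts with "font",
# one unrelated span, and one where "font7" is not the first class.
html = (
    '<p><span class="font5">first</span>'
    '<span class="note">second</span>'
    '<span class="bold font7">third</span></p>'
)
soup = BeautifulSoup(html, "html.parser")

# [class^=font]: the attribute value must begin with "font".
starts_with = [s.get_text() for s in soup.select("span[class^=font]")]

# [class*=font]: "font" may appear anywhere in the attribute value.
contains = [s.get_text() for s in soup.select("span[class*=font]")]

print(starts_with)
print(contains)
```
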
Padraic Cunningham

print(''.join(soup.find_all(string=True)))

(answered here)
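
For context, here is a short sketch of why soup.string returned nothing in the question while joining the text nodes works. It assumes a shortened version of the question's snippet and Python 3 with a recent bs4; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

html = '<span class="font5"> began on the 17<sup>th</sup> day of the second month </span>'
soup = BeautifulSoup(html, "html.parser")
span = soup.find("span")

# .string is None because the span has several children (text, <sup>, text).
direct = span.string

# Joining every text node gives the full text, including the <sup> contents.
joined = "".join(span.find_all(string=True))

# get_text() performs the same concatenation in one call.
text = span.get_text()

print(direct)
print(joined)
```
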

Ester Lin