Extracting information from an element of an HTML file

Question

I want to extract 402 from the following string. I am using Beautiful Soup.

<span class="bla bla bla"> <span class="ba1 ba1">  </span>402.00</span>

I tried using strip(), but the object I have is a bs4.element.ResultSet, which doesn't allow this.

Please suggest how I can do this. Any pointers would be appreciated.

score 4 · Answer 1 · answered Sep 09 '14 at 13:22

Find the inner span and get the next_sibling:

soup.find('span', class_='bla').find('span', class_='ba1').next_sibling

Demo:

>>> from bs4 import BeautifulSoup
>>> data = '<span class="bla bla bla"> <span class="ba1 ba1">  </span>402.00</span>'
>>> soup = BeautifulSoup(data, 'html.parser')  # name the parser explicitly
>>> soup.find('span', class_='bla').find('span', class_='ba1').next_sibling
u'402.00'
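
The question asks for the number 402 and mentions a ResultSet, so here is a minimal sketch that goes one step further than the demo: it loops over the result of find_all() and converts each trailing text node to an integer. The helper name amount_after_inner_span is hypothetical, and the explicit 'html.parser' argument is an assumption about which parser is available.

```python
from bs4 import BeautifulSoup

data = '<span class="bla bla bla"> <span class="ba1 ba1">  </span>402.00</span>'
soup = BeautifulSoup(data, "html.parser")


def amount_after_inner_span(outer):
    """Return the text that follows the inner span as an int, or None if absent.

    Hypothetical helper, written only for this illustration.
    """
    inner = outer.find("span", class_="ba1")
    if inner is None or inner.next_sibling is None:
        return None
    # next_sibling is a NavigableString such as '402.00'; strip() works on it
    return int(float(str(inner.next_sibling).strip()))


# find_all() returns a ResultSet (a list of Tag objects), which has no strip();
# handle each Tag individually instead of calling string methods on the list.
amounts = [amount_after_inner_span(tag) for tag in soup.find_all("span", class_="bla")]
print(amounts)
```

Calling strip() on the ResultSet itself fails because it is a list of tags; the string methods only apply to the individual text nodes.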

alecxe


For the element soup I am getting an error: AttributeError: 'NoneType' object has no attribute 'find'. – ayush biyani Sep 09 '14 at 13:50
@ayushbiyani well, you've provided the HTML code and the solution I've provided works for it. Provide the real HTML you are parsing (relevant part including `span` tags and the desired data). Thanks. – alecxe Sep 09 '14 at 14:00
@ayush biyani, maybe you are using an older version of BeautifulSoup that didn't handle the 'class' attribute as easily; see http://stackoverflow.com/questions/5041008/handling-class-attribute-in-beautifulsoup – sebdelsol Sep 09 '14 at 14:21
BTW, you could get a quicker result with soup.find('span', class_='bla').getText() – sebdelsol Sep 09 '14 at 14:22
@sebdelsol really good point about the version of `BeautifulSoup`. I bet this is it. – alecxe Sep 09 '14 at 14:26
402.00 – ayush biyani Sep 09 '14 at 15:24
d.find("span",{"class" : "bld lrg red"}).find("span",{"class" : "currencyINR"}).next_sibling Traceback (most recent call last): File "", line 1, in d.find("span",{"class" : "bld lrg red"}).find("span",{"class" : "currencyINR"}).next_sibling AttributeError: 'NoneType' object has no attribute 'find' – ayush biyani Sep 09 '14 at 16:00
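
The AttributeError in the comments means the first find() returned None, i.e. no matching outer span was found in the parsed document. A minimal sketch of a defensive lookup follows. The markup is hypothetical (the real page was never posted; only the class names are taken from the comment above), and price_text is an illustrative helper, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the class names come from the comment thread.
html = '<span class="bld lrg red"><span class="currencyINR">  </span>402.00</span>'
soup = BeautifulSoup(html, "html.parser")


def price_text(doc):
    """Return the stripped text after the currency span, or None if not found."""
    # The CSS selector requires all three classes, in any order
    outer = doc.select_one("span.bld.lrg.red")
    if outer is None:
        return None  # avoids "'NoneType' object has no attribute 'find'"
    inner = outer.find("span", class_="currencyINR")
    if inner is None or inner.next_sibling is None:
        return None
    return str(inner.next_sibling).strip()


found = price_text(soup)
missing = price_text(BeautifulSoup("<p>no price here</p>", "html.parser"))
print(found, missing)
```

If the lookup returns None on the real page, printing the parsed document and checking that the span is actually present (and not generated later by JavaScript) is a reasonable next step.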
