
I am trying to parse and extract some information from a web page that contains CSS and, of course, HTML. I am using cssutils and BeautifulSoup for this. Let's say I want to find the font size used for a table heading. BeautifulSoup tells me where the table is defined in the HTML. But if I want to know which style is applied to the table, can I get that information from BeautifulSoup? If not, how do I go about solving this problem? Thanks for any help.

R11

1 Answer


Yes, you can, at least when the style is set inline on the tag. BeautifulSoup is a good choice for this, and combined with a regular expression it is quite powerful :)

Example:

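import re
from bs4 import BeautifulSoup  # BeautifulSoup 4; the old `BeautifulSoup` module is version 3

# Parse the markup and read the inline style attribute of the <h1> tag
soup = BeautifulSoup('<h1 style="font-size: 12px; margin: 5px">Test</h1>', 'html.parser')
style = soup.find('h1')['style']
# Match the font-size declaration up to the next semicolon
re.findall(r'font-size[^;]+', style)
# ['font-size: 12px']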
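
If you need more than one property, a regex per property gets tedious. Here is a minimal sketch that splits the inline style string into a dict instead, using only the standard library. The helper name `parse_inline_style` is hypothetical (it is not part of BeautifulSoup or cssutils), and the splitting is naive: it assumes no value contains a semicolon, so something like a `url(data:...;base64,...)` value would not survive.

```python
def parse_inline_style(style):
    """Split a CSS inline style string into a {property: value} dict.

    Hypothetical helper for illustration; assumes values contain no ';'.
    """
    declarations = {}
    for chunk in style.split(';'):
        # Skip empty pieces (e.g. after a trailing semicolon) and malformed ones.
        if ':' not in chunk:
            continue
        name, value = chunk.split(':', 1)
        # CSS property names are case-insensitive, so normalise to lower case.
        declarations[name.strip().lower()] = value.strip()
    return declarations


# Same style string as the h1 tag in the example above.
props = parse_inline_style('font-size: 12px; margin: 5px')
print(props.get('font-size'))
```

You would pass it the value you got from `soup.find('h1')['style']`. Note that this only covers inline styles; rules that come from a stylesheet are not visible in the tag's attributes.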
Abbasov Alexander