I am trying to parse and extract some information from a web page that contains CSS and, of course, HTML. I am using cssutils and BeautifulSoup for this. Let's say I want to find out the font size used for a table heading. BeautifulSoup tells me where the table is defined in the HTML. But if I want to know which style is applied to the table, can I get that information from BeautifulSoup? If not, how do I go about solving this problem? Thanks for any help.
Can you give example code? – Abbasov Alexander Jul 03 '13 at 22:14
1 Answer
Yes, you can get it, at least for styles set inline through the tag's style attribute. BeautifulSoup is a good choice for that, and combined with a regular expression it is quite powerful :)
Example:
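import re
from bs4 import BeautifulSoup  # BeautifulSoup 4; the older package was imported as `from BeautifulSoup import BeautifulSoup`

# A heading with an inline style attribute
soup = BeautifulSoup('<h1 style="font-size: 12px; margin: 5px">Test</h1>', 'html.parser')
# Read the raw style string from the first <h1>
style = soup.find('h1')['style']
# Match from "font-size" up to the next semicolon
re.findall('font-size[^;]+', style)  # ['font-size: 12px']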
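If you need more than one property, the regular expression above can be swapped for a small helper that splits the whole style string into a dict. This is only a rough sketch: `parse_inline_style` is a name I made up for illustration, not part of BeautifulSoup or cssutils, and it assumes simple declarations with no semicolons inside the values.

```python
def parse_inline_style(style):
    """Split a style attribute value into a {property: value} dict."""
    declarations = {}
    for chunk in style.split(';'):
        # Skip empty pieces, such as the one left after a trailing semicolon
        if ':' not in chunk:
            continue
        name, value = chunk.split(':', 1)
        # CSS property names are case-insensitive, so normalise to lowercase
        declarations[name.strip().lower()] = value.strip()
    return declarations


# The string here stands in for what soup.find('h1')['style'] returns
styles = parse_inline_style('font-size: 12px; margin: 5px')
print(styles.get('font-size'))  # 12px
```

Like the regex, this only sees inline styles. Rules that come from a style block or an external stylesheet are not in the tag's attributes, so for those you would probably need a real CSS parser such as the cssutils package mentioned in the question.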

Abbasov Alexander