<table id="t_id" cellspacing="0" border="0" align="center" height="700" width="600" cellpadding="0">
<tbody>
<tr><td> ..test... </td></tr>
<tr><td> ..test... </td></tr>
<tr><td> ..test... </td></tr>
</tbody>
</table>
Chris B.
user12345

2 Answers

Many people now prefer lxml to BeautifulSoup. Here is how to read the two attributes with it:

from lxml import etree
data = """<table id="t_id" cellspacing="0" border="0" align="center" height="700" width="600" cellpadding="0">
<tbody>
<tr><td> ..test... </td></tr>
<tr><td> ..test... </td></tr>
<tr><td> ..test... </td></tr>
</tbody>
</table>
"""
tree = etree.fromstring(data)
table_element = tree.xpath("/table")[0]  # xpath() returns a list of matches, so take the first one
# Parenthesized single-argument print works on both Python 2 and Python 3
print(table_element.attrib['height'] + " and " + table_element.attrib['width'])
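
The snippet above relies on the markup being well-formed XML, which real-world HTML often is not. As a minimal sketch of a more forgiving variant, assuming lxml is installed, the `lxml.html` parser can be used and the table selected by its `id`. The helper name `table_size` is hypothetical and only for illustration.

```python
import lxml.html

# Sample markup copied from the question.
data = """<table id="t_id" cellspacing="0" border="0" align="center" height="700" width="600" cellpadding="0">
<tbody>
<tr><td> ..test... </td></tr>
<tr><td> ..test... </td></tr>
<tr><td> ..test... </td></tr>
</tbody>
</table>
"""


def table_size(html, table_id):
    """Return (height, width) of the table with the given id, or None.

    Hypothetical helper: the name and return shape are assumptions.
    """
    # document_fromstring() tolerates sloppy HTML and always returns the <html> root.
    doc = lxml.html.document_fromstring(html)
    # An XPath variable avoids building the expression by string concatenation.
    matches = doc.xpath("//table[@id=$tid]", tid=table_id)
    if not matches:
        return None
    table = matches[0]
    # .get() returns None for a missing attribute instead of raising KeyError.
    return table.get("height"), table.get("width")


height, width = table_size(data, "t_id")
print(height + " and " + width)  # prints: 700 and 600
```

Attribute values come back as strings, so convert them with `int()` if you need numbers.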
Uku Loskit
  • Why do people prefer lxml? Performance reasons? The BeautifulSoup solution is shorter and looks more Pythonic, IMHO. – Dzinx Feb 10 '11 at 16:13
  • I'm a fan of BeautifulSoup too, but it does look like it's going the way of the dodo: http://stackoverflow.com/questions/1922032/parsing-html-in-python-lxml-or-beautifulsoup-which-of-these-is-better-for-what/1922064#1922064 – RJ Regenold Feb 10 '11 at 16:24
  • You can still use BeautifulSoup without problems if you're not building anything "critical". However, there are a lot of changes in the latest (3.1.0) version. If you want to use BS, I would recommend using 3.0.8. – Herberth Amaral Feb 10 '11 at 18:16

If this is your whole HTML, then this will suffice:

import BeautifulSoup
soup = BeautifulSoup.BeautifulSoup("...your HTML...")
print soup.table['width'], soup.table['height']
# prints: 600 700

If you need to search for the table first, it's not much more complicated, either:

table = soup.find('table', id='t_id')
print table['width'], table['height']
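
The code above targets the original `BeautifulSoup` module on Python 2. For readers on Python 3, where the maintained package is `bs4`, the same lookup might look like the sketch below. The helper name `table_size` and the choice of the built-in `html.parser` backend are assumptions for illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question.
data = """<table id="t_id" cellspacing="0" border="0" align="center" height="700" width="600" cellpadding="0">
<tbody>
<tr><td> ..test... </td></tr>
<tr><td> ..test... </td></tr>
<tr><td> ..test... </td></tr>
</tbody>
</table>
"""


def table_size(html, table_id):
    """Return (width, height) of the table with the given id, or None.

    Hypothetical helper: the name and return shape are assumptions.
    """
    # Naming the parser explicitly avoids bs4's "no parser specified" warning.
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        return None
    # .get() returns None for a missing attribute instead of raising KeyError.
    return table.get("width"), table.get("height")


width, height = table_size(data, "t_id")
print(width, height)  # prints: 600 700
```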
Dzinx