Here's what I have:
import requests
import lxml.html
import lxml.etree

r = requests.get("http://www.cnn.com/")
htmlelement = lxml.html.fromstring(r.text)
html = lxml.html.tostring(htmlelement)
tree = lxml.etree.fromstring(html)
print(tree.xpath('//*[@id="cnn_maintt1imgbul"]/div/div[2]/div/h1/a'))
I thought lxml.html corrected the broken HTML?
The error is:
XMLSyntaxError: Opening and ending tag mismatch: link line 32 and head, line 75, column 8
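In case it helps, here is a minimal offline sketch that seems to reproduce the same kind of failure without hitting the network. The SAMPLE markup and the "main" id are made up for illustration and are not the real CNN page. My assumption is that lxml.html.tostring writes HTML by default, so the <link> tag comes back without a closing tag and the strict XML parser then rejects it, while querying the element from lxml.html.fromstring directly appears to work.

```python
import lxml.etree
import lxml.html

# Hypothetical stand-in for the downloaded page: <link> has no closing tag.
SAMPLE = (
    '<html><head><link rel="stylesheet" href="a.css"></head>'
    '<body><div id="main"><h1><a href="/x">Headline</a></h1></div></body></html>'
)

htmlelement = lxml.html.fromstring(SAMPLE)

# Serialized as HTML by default, so <link> is still left unclosed.
html = lxml.html.tostring(htmlelement)

try:
    lxml.etree.fromstring(html)  # strict XML parser
    xml_error = None
except lxml.etree.XMLSyntaxError as exc:
    xml_error = exc

# The HTML element itself already supports xpath.
links = htmlelement.xpath('//*[@id="main"]/h1/a')

print(xml_error)
print([a.text for a in links])
```

So the round trip through tostring and lxml.etree.fromstring may be the step that breaks, rather than the HTML parsing itself.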
Thanks!