I am trying to use lxml to validate a piece of HTML, but it rejects the fragment as invalid even though it should be valid:
from io import StringIO
import lxml.etree

img = """<img src="http://api.com/?data=ey&ip=1&img=1" height="1" width="1">"""
parser = lxml.etree.HTMLParser(recover=False)  # strict: do not recover from errors
lxml.etree.parse(StringIO(img), parser)
raises:
XMLSyntaxError: htmlParseEntityRef: expecting ';', line 1, column 37
Changing the & separating the parts of the query string to ; seems to fix the issue, but that should not be required. Using semicolons is only a recommendation of the W3C, not a requirement.
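For comparison, here is a minimal sketch of another workaround I could fall back on: escaping the ampersands as &amp; before the URL goes into the attribute, which is what HTML 4 expects inside attribute values. The helper names build_img_tag and parses_strictly are made up for illustration, and the sketch assumes I control how the tag is generated, which may not hold for markup received from elsewhere.

```python
from html import escape
from io import StringIO


def build_img_tag(url: str) -> str:
    """Return an <img> tag with the URL escaped for use in an attribute."""
    # escape() turns & into &amp; (and also escapes <, > and quotes).
    return '<img src="{}" height="1" width="1">'.format(escape(url, quote=True))


def parses_strictly(fragment: str) -> bool:
    """Report whether lxml's non-recovering HTML parser accepts the fragment."""
    # lxml is a third-party package, so it is imported only when needed.
    from lxml import etree

    parser = etree.HTMLParser(recover=False)
    try:
        etree.parse(StringIO(fragment), parser)
    except etree.XMLSyntaxError:
        return False
    return True


# Hypothetical helper applied to the URL from the question.
img = build_img_tag("http://api.com/?data=ey&ip=1&img=1")
print(img)
```

Calling parses_strictly(img) requires lxml to be installed. I would expect the escaped tag to be accepted, but I have not checked that across libxml2 versions, and it still does not make the parser accept the original unescaped fragment.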
Is there something I can do to get lxml to see this fragment as valid?