I am trying to parse the file browser Thunar's custom actions files (~/.config/Thunar/uca.xml) with the lxml Python module.
For some reason, Thunar writes a malformed XML declaration into these files:
<?xml encoding="UTF-8" version="1.0"?>
The XML specification requires version to appear first in the declaration, before encoding, so lxml raises an XMLSyntaxError when I try to parse the file.
And no, I cannot simply correct the declaration, because Thunar keeps overwriting it with the bogus one.
This is very likely a bug in Thunar.
Nevertheless, I would like to know how to ignore the XML declaration with lxml.
I know that I could pre-process the XML document to filter out the XML declaration, but this doesn't seem very elegant. Since XML seems to default to version 1.0 and UTF-8 encoding, there is surely a way to make lxml ignore the declaration and assume those defaults. I didn't find anything in the documentation or on Google, but I might have overlooked something.