I have an element in a page that looks like this:
<a id="cid-694094:Comment:188384" name="694094:Comment:188384"></a>
If I call document.cssselect("#cid-694094:Comment:188384")
I get:
lxml.cssselect.ExpressionError: The psuedo-class Symbol(u'Comment', 12) is unknown
The fix for that is covered in this question (the asker was using Java): escape each colon with a backslash.
However, when I try that in Python like this:
document.cssselect(r"#cid-694094\:Comment\:188384")
I get:
lxml.cssselect.SelectorSyntaxError: Bad symbol 'cid-694094\': 'unicodeescape' codec can't decode byte 0x5c in position 10: \ at end of string at [Token(u'#', 0)] -> None
The reason for that, along with a proposed solution, can be found in this question. If I understand it correctly, I should be doing:
document.cssselect(r"#cid-694094\\:Comment\\:188384")
But this still doesn't work. Instead, I get an ExpressionError again:
lxml.cssselect.ExpressionError: The psuedo-class Symbol(u'Comment\', 14) is unknown
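As a sanity check on the string literals themselves, here is a small sketch in plain Python that does not touch lxml (the variable names are only for illustration). It suggests that the r prefix already preserves the backslash, so doubling it inside a raw string sends two backslashes per colon to the selector parser rather than one, which may be why the second attempt ends up with a pseudo-class named Comment\:

```python
# Plain-Python check of what each selector literal actually contains.
# The variable names below are illustrative only.

# Non-raw literal: each "\\" in the source collapses to ONE backslash.
plain = "#cid-694094\\:Comment\\:188384"

# Raw literal with a single backslash: ONE backslash before each colon.
raw_single = r"#cid-694094\:Comment\:188384"

# Raw literal with a doubled backslash: TWO backslashes before each colon.
raw_double = r"#cid-694094\\:Comment\\:188384"

print(plain == raw_single)       # the two spellings are the same string
print(raw_single.count("\\"))    # number of backslashes in the first attempt
print(raw_double.count("\\"))    # number of backslashes in the second attempt
```

If that reading is right, the two attempts pass different strings to cssselect, and neither is handled the way I expected.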
Can anybody tell me what I'm doing wrong?
You can reproduce it with:
import lxml.html
document = lxml.html.fromstring(
'<a id="cid-694094:Comment:188384" name="694094:Comment:188384"></a>'
)
document.cssselect(r"#cid-694094\:Comment\:188384")