
I'm using HtmlAgilityPack with Fizzler, and I'm trying to select an element whose ID contains a colon.

using Fizzler.Systems.HtmlAgilityPack;

Test #1 (Unknown Pseudo Class)

HtmlNodeSelection.QuerySelectorAll(_htmlDocument.DocumentNode, "#unlocktheinbox:test");

Test #2 (Invalid Character at Position 16.)

HtmlNodeSelection.QuerySelectorAll(_htmlDocument.DocumentNode, "#unlocktheinbox\\:test");

Test #3 (Unrecognized escape sequence)

HtmlNodeSelection.QuerySelectorAll(_htmlDocument.DocumentNode, "#unlocktheinbox\3A test");

Test #4 (Invalid Character at Position 16.)

HtmlNodeSelection.QuerySelectorAll(_htmlDocument.DocumentNode, "#unlocktheinbox\\3A test");

What am I doing wrong?

Update: I looked at the Fizzler source code, and it turns out the tokenizer has this TODO:

 // TODO Support full string syntax!
 //
 // string    {string1}|{string2}
 // string1   \"([^\n\r\f\\"]|\\{nl}|{nonascii}|{escape})*\"
 // string2   \'([^\n\r\f\\']|\\{nl}|{nonascii}|{escape})*\'
 // nonascii  [^\0-\177]
 // escape    {unicode}|\\[^\n\r\f0-9a-f]
 // unicode   \\[0-9a-f]{1,6}(\r\n|[ \n\r\t\f])?
 //

They don't support it yet :(

Henry

1 Answer


\3A is a compile-time error because \3 is not a valid escape sequence in a C# string literal, so you need to escape the backslash itself. Both \\: and \\3A (followed by a space) are correct: once the C# compiler has processed them, each yields a valid CSS escape for the colon. The selector engine, however, appears to have trouble with CSS escape sequences.
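To make the two layers of escaping concrete, here is a small sketch with no Fizzler involved. It only shows what text the C# compiler hands to the selector engine; the class and constant names are made up for illustration.

```csharp
using System;

static class SelectorLiterals
{
    // Regular literal: the compiler turns \\ into a single backslash.
    public const string Regular = "#unlocktheinbox\\:test";

    // Verbatim literal: backslashes are kept exactly as typed.
    public const string Verbatim = @"#unlocktheinbox\:test";

    // Hex form of the same CSS escape; the trailing space ends the code point.
    public const string Hex = @"#unlocktheinbox\3A test";

    static void Main()
    {
        Console.WriteLine(Regular);                    // prints #unlocktheinbox\:test
        Console.WriteLine(Regular == Verbatim);        // prints True
        Console.WriteLine(Hex);                        // prints #unlocktheinbox\3A test
        Console.WriteLine(Regular.IndexOf('\\') + 1);  // prints 16
    }
}
```

The backslash is the 16th character in both escaped forms. If Fizzler counts positions from 1 (an assumption, I have not checked), that matches the "Invalid Character at Position 16" error: the string arrives intact and the tokenizer rejects the backslash itself.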

See if you can work around this with an attribute selector instead, which removes the need for escape sequences altogether:

HtmlNodeSelection.QuerySelectorAll(_htmlDocument.DocumentNode, "[id='unlocktheinbox:test']");
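For completeness, a minimal sketch of that workaround in context, next to a different technique: HtmlAgilityPack's own GetElementbyId, which does not go through a CSS parser at all. The helper names and the sample markup are invented for illustration, and I have not run this against every Fizzler version.

```csharp
using System;
using HtmlAgilityPack;
using Fizzler.Systems.HtmlAgilityPack;

static class ColonIdLookup
{
    // Attribute selector: the colon sits inside a quoted string, so no CSS escape is needed.
    // Assumes the id itself contains no single quote.
    public static HtmlNode ViaFizzler(HtmlDocument doc, string id) =>
        doc.DocumentNode.QuerySelector("[id='" + id + "']");

    // HtmlAgilityPack's built-in id lookup; no CSS parsing involved.
    public static HtmlNode ViaGetElementById(HtmlDocument doc, string id) =>
        doc.GetElementbyId(id);

    static void Main()
    {
        var doc = new HtmlDocument();
        // Made-up sample markup for illustration.
        doc.LoadHtml("<div id='unlocktheinbox:test'>hello</div>");

        Console.WriteLine(ViaFizzler(doc, "unlocktheinbox:test")?.InnerText);
        Console.WriteLine(ViaGetElementById(doc, "unlocktheinbox:test")?.InnerText);
    }
}
```

The attribute selector is the better fit when the lookup is part of a larger CSS selector; the direct id lookup is simpler when the id is all you need.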
BoltClock