I'm parsing CSS3 selectors using a regex. For example, the selector a>b,c+d
is broken down into:
Selector:
a>b
c+d
SOSS:
a
b
c
d
TypeSelector:
a
b
c
d
Identifier:
a
b
c
d
Combinator:
>
+
The problem is that I don't know which selector a given combinator belongs to, for example the > combinator. The Selector group has 2 captures (as shown above), each containing 1 combinator, and I want to know which combinator belongs to which capture.
Groups have lists of Captures, but Captures don't have lists of Groups found in that Capture. Is there a way around this, or should I just re-parse each selector?
Edit: Each capture does give you the index of where the match occurred though... maybe I could use that information to determine what belongs to what?
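A minimal sketch of that index-based idea, assuming .NET's System.Text.RegularExpressions: a Combinator capture is treated as belonging to a Selector capture when its span (Index to Index + Length) lies inside the selector's span. The pattern is a cut-down stand-in for the real grammar (lowercase type selectors and the >, + and ~ combinators only), and CaptureNesting, IsInside and CombinatorsBySelector are illustrative names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class CaptureNesting
{
    // Cut-down stand-in for the real grammar (assumption): lowercase type
    // selectors joined by >, + or ~. .NET lets a group name repeat, and every
    // occurrence adds to the same Captures list.
    private const string SelectorPattern =
        @"(?<Selector>(?<SOSS>[a-z]+)(\s*(?<Combinator>[>+~])\s*(?<SOSS>[a-z]+))*)";

    private static readonly Regex Gos =
        new Regex(@"^\s*" + SelectorPattern + @"(\s*,\s*" + SelectorPattern + @")*\s*$");

    // True when inner's character span lies entirely within outer's span.
    private static bool IsInside(Capture inner, Capture outer) =>
        inner.Index >= outer.Index &&
        inner.Index + inner.Length <= outer.Index + outer.Length;

    // Pairs each Selector capture with the Combinator captures found inside it.
    public static List<(string Selector, List<string> Combinators)> CombinatorsBySelector(string input)
    {
        Match match = Gos.Match(input);
        if (!match.Success)
            throw new ArgumentException("Not a valid selector group.", nameof(input));

        var combinators = match.Groups["Combinator"].Captures.Cast<Capture>().ToList();

        return match.Groups["Selector"].Captures.Cast<Capture>()
            .Select(sel => (
                Selector: sel.Value,
                Combinators: combinators
                    .Where(c => IsInside(c, sel))
                    .Select(c => c.Value)
                    .ToList()))
            .ToList();
    }

    public static void Main()
    {
        foreach (var (selector, combinators) in CombinatorsBySelector("a>b,c+d"))
            Console.WriteLine($"{selector}: {string.Join(" ", combinators)}");
    }
}
```

For a>b,c+d this pairs a>b with > and c+d with +. The same span check should work for any group that is textually nested inside another, so one pass over the captures may be enough to avoid re-parsing each selector.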
So you don't think I'm insane, the syntax is actually quite simple, using my special dict class:
var flex = new FlexDict
{
    {"GOS"/*Group of Selectors*/, @"^\s*{Selector}(\s*,\s*{Selector})*\s*$"},
    {"Selector", @"{SOSS}(\s*{Combinator}\s*{SOSS})*{PseudoElement}?"},
    {"SOSS"/*Sequence of Simple Selectors*/, @"({TypeSelector}|{UniversalSelector}){SimpleSelector}*|{SimpleSelector}+"},
    {"SimpleSelector", @"{AttributeSelector}|{ClassSelector}|{IDSelector}|{PseudoSelector}"},
    {"TypeSelector", @"{Identifier}"},
    {"UniversalSelector", @"\*"},
    {"AttributeSelector", @"\[\s*{Identifier}(\s*{ComparisonOperator}\s*{AttributeValue})?\s*\]"},
    {"ClassSelector", @"\.{Identifier}"},
    {"IDSelector", @"#{Identifier}"},
    {"PseudoSelector", @":{Identifier}{PseudoArgs}?"},
    {"PseudoElement", @"::{Identifier}"},
    {"PseudoArgs", @"\([^)]*\)"},
    {"ComparisonOperator", @"[~^$*|]?="},
    {"Combinator", @"[ >+~]"},
    {"Identifier", @"-?[a-zA-Z\u00A0-\uFFFF_][a-zA-Z\u00A0-\uFFFF_0-9-]*"},
    {"AttributeValue", @"{Identifier}|{String}"},
    {"String", @""".*?(?<!\\)""|'.*?(?<!\\)'"},
};
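For readers wondering how a dictionary like this can turn into a single regex with named groups, here is a rough sketch of one possible expansion. It is only a guess at the idea, not the actual FlexDict implementation: PlaceholderExpansion and Expand are hypothetical names, the two-rule grammar is a toy, and the rules are assumed to be non-recursive.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PlaceholderExpansion
{
    // Matches {Name} but not regex quantifiers such as {2} or {1,3}.
    private static readonly Regex Placeholder = new Regex(@"\{([A-Za-z_]\w*)\}");

    // Replaces each {Name} with a named group wrapping that rule's expanded body.
    // Assumes the rules are not recursive, otherwise this never terminates.
    public static string Expand(string name, IDictionary<string, string> rules)
    {
        string body = Placeholder.Replace(
            rules[name], m => Expand(m.Groups[1].Value, rules));
        return "(?<" + name + ">" + body + ")";
    }

    public static void Main()
    {
        // Toy grammar (assumption), standing in for the full selector rules.
        var rules = new Dictionary<string, string>
        {
            { "List", @"^{Item}(\s*,\s*{Item})*$" },
            { "Item", @"[a-z]+" },
        };

        var regex = new Regex(Expand("List", rules));
        Match match = regex.Match("a, b, c");

        // Every occurrence of {Item} feeds the same group's Captures list.
        foreach (Capture item in match.Groups["Item"].Captures)
            Console.WriteLine($"{item.Value} at index {item.Index}");
    }
}
```

Because each rule name becomes a group name that can occur many times in the expanded pattern, a single group ends up holding captures from different parts of the input, which is why the capture positions matter for working out what belongs to what.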