
I'm working on a C# editor and currently focusing on its syntax highlighting system.

My problem is that the highlighted regions overlap. For example:

I want to write this comment in the editor: @* double quote: " *@

But because the double quote is the character that starts a string, the result looks like this:

comment-string bug

And also:

string-comment bug

Single-line comment regex: SINGLE_LINE_COMMENTS = "@.*"

Multi-line comment regex: MULTI_LINE_COMMENTS = @"@\*.*?(\*@|$)"

And finally, the string regex:

STRINGS = @"(\\|f|\\f|f\\)?\s*(""""|""(.|\\"")*?([^\\]""|$)|''|'(.|\\')*?([^\\]'|$))"

I know, it's a bit of a monster, but it is what it is (it also supports both ' and " for strings, like Python). If you want more explanation about it, I can explain.

Also, this is the code that finds the matches with Regex.Matches (for comments and strings):

MatchCollection singleLineComments = Regex.Matches(code_section.Text, SINGLE_LINE_COMMENTS);
MatchCollection multiLineComments = Regex.Matches(code_section.Text, MULTI_LINE_COMMENTS, RegexOptions.Singleline);
MatchCollection strings = Regex.Matches(code_section.Text, STRINGS, RegexOptions.Singleline);

Finally, for highlighting, I loop through each MatchCollection and highlight the matched sections of the code.

My question: is there a way in C# or regex to say, for example, "if this text is inside a comment, don't apply string highlighting", and so on?

  • The answers to [lexers vs parsers](https://stackoverflow.com/questions/2842809/lexers-vs-parsers) might give you ideas of how to do that. – Andrew Morton Jul 09 '21 at 17:55
  • [Balancing Groups](https://learn.microsoft.com/en-us/dotnet/standard/base-types/grouping-constructs-in-regular-expressions#balancing_group_definition) may help you. They allow to parse irregular grammars. – Alexander Petrov Jul 09 '21 at 18:02
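
One possible way to act on the lexer suggestion, as a rough sketch rather than a definitive fix: instead of running three independent Regex.Matches passes, combine the patterns into a single alternation and scan once. Matches from one pass never overlap, so whichever token starts first (comment or string) consumes its whole span and the other kind cannot match inside it. The names `HighlightSketch`, `Tokenize`, and the group names are hypothetical, and the string pattern is a simplified stand-in (an assumption) for the STRINGS regex above: it drops the prefix handling.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HighlightSketch
{
    // One combined pattern. The engine scans left to right and, at each
    // position, takes the first alternative that matches.
    // The multi-line comment comes before the single-line comment because
    // "@*" would otherwise be matched as a single-line comment.
    private static readonly Regex Token = new Regex(
        @"(?<MultiLineComment>@\*.*?(\*@|\z))" +
        @"|(?<SingleLineComment>@[^\n]*)" +
        // Simplified string pattern (assumption): no prefix handling.
        @"|(?<String>""(\\.|[^""\\])*(""|\z)|'(\\.|[^'\\])*('|\z))",
        RegexOptions.Singleline | RegexOptions.ExplicitCapture);

    private static readonly string[] Kinds =
        { "MultiLineComment", "SingleLineComment", "String" };

    // Returns non-overlapping tokens in the order they appear in the text.
    public static List<(string Kind, int Index, int Length)> Tokenize(string text)
    {
        var tokens = new List<(string Kind, int Index, int Length)>();
        foreach (Match m in Token.Matches(text))
        {
            // Exactly one named group succeeds per match; it names the token kind.
            foreach (string kind in Kinds)
            {
                if (m.Groups[kind].Success)
                {
                    tokens.Add((kind, m.Index, m.Length));
                    break;
                }
            }
        }
        return tokens;
    }
}
```

With this shape, the highlighting loop would iterate over the result of `HighlightSketch.Tokenize(code_section.Text)` and pick a color per `Kind`. For `@* double quote: " *@` the whole text comes back as a single `MultiLineComment` token, and a string containing `@` comes back as a single `String` token.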
