6

How do I match a pattern only if there isn't a specific character before it on the same line?

I have the following regex code:

pattern = @"(?<=^|[\s.(<;])(?<!//)(" + Regex.Escape(keyword) + @")(?=[\s.(>])";
replacement = "<span style='" + keywordStyle + "'>$1</span>";
code = Regex.Replace(code, pattern, replacement);

I would like to add a criterion so that it only matches if there aren't 2 slashes before it on the same line (i.e., it isn't inside a C# comment).

I played around with it, and modified the pattern:

pattern = @"(?<!\/\/)(?<=^|[\s.(<;])(?<!//)(" + Regex.Escape(keyword) + @")(?=[\s.(>])";

But apparently this only works if the 2 slashes are the 2 characters immediately before the keyword.

So this pattern wouldn't match "//foreach", but would match "// foreach".

Can negative look-behinds be used in this case, or can I accomplish this some other way, besides negative look-behinds?

Thank you.

EDIT:

Guess I wasn't clear enough. To reiterate my problem:

I'm working on syntax highlighting, and I need to find matches for C# keywords, like "foreach". However, I also need to take into account comments, which start with 2 slashes. I don't want to match the keyword "foreach" if it is part of a comment (2 slashes anywhere before it on the same line).

The negative lookbehind doesn't help me in this case because the slashes will not necessarily be right before the keyword, for example "// some text foreach" - I don't want this foreach to match.

So again, my question is: How can I modify my pattern to only match if 2 slashes aren't anywhere before it on the same line?

Hope my question is clear now.

Rivka
  • I hope it isn't an [XY-problem](http://www.perlmonks.org/index.pl?node_id=542341) ? – L.B Aug 12 '12 at 19:56
  • Possibly - hence my question "...or can I accomplish this some other way". So any idea what Y would be in this case? :) – Rivka Aug 12 '12 at 19:59
  • `or can I accomplish this some other way` What do you want to accomplish? What is your *real* problem? – L.B Aug 12 '12 at 20:01
  • [The Roslyn Project](http://msdn.microsoft.com/en-us/vstudio/hh500769.aspx) and [C# and VB.NET Code Searcher - Using Roslyn](http://www.codeproject.com/Articles/416472/Csharp-and-VB-NET-Code-Searcher-Using-Roslyn) These are the answers for **X** :) – L.B Aug 12 '12 at 20:26
  • You're coming at this the wrong way. In a syntax highlighter, once any rule is done marking a given section of text, no other rule has any business looking at it again. If something that looks like a keyword appears inside a comment or a string literal, your "keyword" rule should never even see it. It should already have been handled by the "comment" or "string literal" rule. – Alan Moore Aug 13 '12 at 00:13
  • Thank you L.B., I'm the author of C# and VB.NET Code Searcher - Using Roslyn... By the way, that tool searches .NET code, but I also use an editor that does syntax highlighting. The CodeProject article http://www.codeproject.com/Articles/161871/Fast-Colored-TextBox-for-syntax-highlighting has some great examples of syntax highlighting. – woutercx Aug 16 '12 at 21:45
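
The rule ordering described in the comments above (comments and string literals are claimed before the keyword rule gets a look) could be sketched in C# roughly as follows. This is an illustration under stated assumptions, not code from the thread: the class name `SinglePassSketch`, the group names, and the short keyword list are all hypothetical, and verbatim strings, character literals, and block comments are deliberately left out.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglePassSketch
{
    // Alternatives are tried left to right at each position, so a comment or
    // a string literal is consumed whole before the keyword rule can look
    // inside it. The keyword list is an illustrative subset only.
    private static readonly Regex Token = new Regex(
        @"(?<comment>//[^\n]*)" +                 // line comment up to end of line
        @"|(?<str>""(?:\\.|[^""\\\n])*"")" +      // regular "..." string literal
        @"|\b(?<keyword>foreach|if|else|return)\b");

    public static string Highlight(string code, string keywordStyle)
    {
        return Token.Replace(code, m =>
            m.Groups["keyword"].Success
                ? "<span style='" + keywordStyle + "'>" + m.Value + "</span>"
                : m.Value); // comments and strings pass through unchanged
    }
}
```

Because every token is matched exactly once, a "foreach" inside a comment or inside a string such as "http://example" never reaches the keyword branch, and no lookbehind is needed.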

4 Answers

5

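Simplifying your regex pattern a bit, what about the following? It uses a negative lookbehind that rejects the keyword when "//" plus 0 or more characters (matched non-greedily) comes before it. Since "." does not match a newline by default, the check stays on the same line.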

(?<!//.*?)(?<Keyword>foreach)
David Andres
  • This worked for me - thank you. End result: *pattern = @"(?<!//.*?)((?<=^|[\s.(<;])(?<!//)(" + Regex.Escape(keyword) + @")(?=[\s.(>]))";* – Rivka Aug 13 '12 at 20:50
  • @Rivka: I'm glad this worked for you. One small caveat to be aware of if you ever have to transition to a different RegEx engine: not all engines support variable-length patterns within lookbehind expressions. .NET does, but Perl, to my knowledge, does not. – David Andres Aug 14 '12 at 12:05
  • good to know (although don't plan on changing in near future) - thanks! – Rivka Aug 14 '12 at 14:55
  • I tried this in a similar application with `module_line=re.compile(r'(?<!//.*?)module')` and got the error `error: look-behind requires fixed-width pattern`. The error goes away if I remove the `.*`. How can this be solved? (Tried with Python `re` on Ubuntu 16.04 LTS.) – vineeshvs May 01 '19 at 12:25
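
For reference, here is a minimal C# sketch of the variable-length lookbehind from the answer above. It is not the answerer's code: the class name `LookbehindSketch` and the `Highlight` helper are hypothetical, and `\b` word boundaries stand in for the character classes used in the question. It relies on .NET accepting variable-length patterns inside a lookbehind, which engines that require a fixed width (as in the Python error quoted above) will not.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindSketch
{
    // Hypothetical helper: wraps `keyword` in a span unless "//" occurs
    // earlier on the same line.
    public static string Highlight(string code, string keyword, string style)
    {
        // (?<!//.*) : fail if "//" followed by any run of non-newline
        //             characters ends right where the keyword starts.
        //             '.' does not match '\n' by default, so the check
        //             cannot reach back into a previous line.
        // \b...\b   : simplified word boundaries (an assumption, not the
        //             boundary classes from the question).
        string pattern = @"(?<!//.*)\b(" + Regex.Escape(keyword) + @")\b";
        string replacement = "<span style='" + style + "'>$1</span>";
        return Regex.Replace(code, pattern, replacement);
    }
}
```

One known limitation of this approach: a "//" inside a string literal (for example a URL) also suppresses every keyword after it on that line.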
1

Without knowing exactly what you're attempting, it's hard to say what the best solution is, but most likely it's simply checking the beginning of the line for // before you bother trying the regex, especially if there can be more than one keyword per line.
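
A rough C# sketch of that idea, generalized slightly: instead of checking only the start of the line, it finds the first "//" anywhere on the line and applies the regex only to the text before it. The names `CommentSplitSketch`, `HighlightLine`, and `Highlight` are hypothetical, and the sketch assumes "//" never appears inside a string literal.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentSplitSketch
{
    public static string HighlightLine(string line, string keyword, string style)
    {
        // Everything from the first "//" onward is treated as a comment.
        // Assumption: "//" does not occur inside a string literal.
        int commentStart = line.IndexOf("//", StringComparison.Ordinal);
        string codePart = commentStart < 0 ? line : line.Substring(0, commentStart);
        string commentPart = commentStart < 0 ? "" : line.Substring(commentStart);

        // Simplified word boundaries stand in for the question's classes.
        string pattern = @"\b(" + Regex.Escape(keyword) + @")\b";
        string replacement = "<span style='" + style + "'>$1</span>";

        // Only the code part is searched; the comment is appended untouched.
        return Regex.Replace(codePart, pattern, replacement) + commentPart;
    }

    public static string Highlight(string code, string keyword, string style)
    {
        string[] lines = code.Split('\n');
        for (int i = 0; i < lines.Length; i++)
        {
            lines[i] = HighlightLine(lines[i], keyword, style);
        }
        return string.Join("\n", lines);
    }
}
```

Splitting once per line means the comment check is paid a single time even when several keywords are highlighted on the same line.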

midgetspy
1

Try this:

^\s*(?<!//.*)\s*foreach

For C# code analysis, try Irony (.NET Language Implementation Kit), a reliable open-source project hosted on CodePlex.

Ria
  • Or these could be used to do c# code analysis: http://www.ndepend.com/Doc_CQLinq_Syntax.aspx or http://www.codeproject.com/Articles/408663/Using-NRefactory-for-analyzing-Csharp-code – woutercx Aug 16 '12 at 21:41
1

If you're working on syntax highlighting, you really should take a look at this CodeProject article: Fast Colored TextBox for Syntax Highlighting. The project is a code editor window that does syntax highlighting too, and it uses regular expressions. Maybe it does what you need (and maybe more). The author seems to have given syntax highlighting a lot of thought. I tried the "foreach" case you talked about here, including a "foreach" that is part of a comment, and it displayed nicely.

woutercx
  • I looked at the source code of this article, and it does seem a lot was put into it. I wanted to go through this on my own though, as opposed to using an already built program, since it's a learning process for me (and it's a little more than I need right now). It's a good reference, though - thank you. – Rivka Aug 17 '09 at 09:56