I am using a file search utility (FileSeek) with regex content search. I am searching for any uncommented lines that contain while...each. I have managed to exclude inline-commented lines such as `// while (list($key, $value) = each($_GET))` with this regex: `^(?:(?!\/\/).)*while.+[\s=(]each[\s(]`

Demo
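
To make the problem concrete, here is a minimal sketch in Python (not the engine FileSeek uses) that runs the regex above over a made-up sample. The sample text and the helper name `matching_lines` are assumptions for illustration only. It shows that the `//` line is skipped while the line inside the block comment is still reported.

```python
import re

# The regex from the question; re.MULTILINE lets ^ match at every line start.
ORIGINAL = re.compile(r"^(?:(?!\/\/).)*while.+[\s=(]each[\s(]", re.MULTILINE)

# Hypothetical sample input, invented for this illustration.
SAMPLE = "\n".join([
    "while (list($k, $v) = each($_GET)) {}",
    "// while (list($k, $v) = each($_GET))",
    "/*",
    "  while (list($k, $v) = each($_POST))",
    "*/",
    "$x = 1; while (list($a, $b) = each($arr)) {}",
])


def matching_lines(pattern, text):
    """Return the 1-based line numbers on which a match starts."""
    return [text.count("\n", 0, m.start()) + 1 for m in pattern.finditer(text)]


print(matching_lines(ORIGINAL, SAMPLE))  # [1, 4, 6]: line 4 is the unwanted hit
```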

How can I improve the regex search (make it even more restrictive) to exclude search results from commented lines and commented code blocks /* */ such as:

/*
  while (list($key, $value) = each($_GET))
*/

Or

/* some code
  while (list($key, $value) = each($_GET))
  some code
*/

In other words, how can I modify my regex to also completely skip/ignore everything inside a commented PHP block /* */ instead of picking up results that are inside it?

EDIT: Just for reference, here is an example that does the opposite, i.e. matches only commented code.

Nikita 웃
  • You could modify your existing expression, use alternation and only capture what is not matched? See [here](https://regex101.com/r/dZQ9Ro/2) – Paolo Jul 29 '18 at 09:29
  • @UnbearableLightness Thanks. I am looking for a solution that is more restrictive than the regex I already posted in the question, which currently picks the `while...each` in the commented code block: https://regex101.com/r/pCQ3QC/1/ – Nikita 웃 Jul 29 '18 at 09:43

1 Answer

If your tool supports it, you can use the (*SKIP)(*FAIL) trick to skip the commented parts.

(?:(?<!:)\/\/.*|\/\*[\s\S]*?\*\/)(*SKIP)(*F)|while.+?[\s=(]each[\s(]

See demo at regex101. This is just a quick try; you will need to adjust the pattern to your needs.
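
For engines without backtracking verbs, the same idea can be approximated with alternation plus a capture group, as suggested in the comments on the question. The sketch below uses Python's standard `re` module, which does not support (*SKIP)(*F); the sample text and the helper name `uncommented_hits` are assumptions for illustration. Comments are matched first and discarded, and only matches that fill group 1 are kept.

```python
import re

# Same alternatives as the (*SKIP)(*F) pattern, but the wanted branch is
# wrapped in a capture group instead of relying on backtracking verbs.
PATTERN = re.compile(
    r"(?<!:)\/\/.*|\/\*[\s\S]*?\*\/|(while.+?[\s=(]each[\s(])"
)

# Hypothetical sample input, invented for this illustration.
SAMPLE = "\n".join([
    "while (list($k, $v) = each($_GET)) {}",
    "// while (list($k, $v) = each($_GET))",
    "/*",
    "  while (list($k, $v) = each($_POST))",
    "*/",
    "$x = 1; while (list($a, $b) = each($arr)) {}",
])


def uncommented_hits(text):
    """Return the while...each fragments found outside // and /* */ comments."""
    # Comment branches match with group 1 unset, so they are filtered out here.
    return [m.group(1) for m in PATTERN.finditer(text) if m.group(1) is not None]


for hit in uncommented_hits(SAMPLE):
    print(hit)
```

This only helps where the calling code can inspect capture groups; a search tool that reports every overall match would still list the comments themselves.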


If this is not supported by your tool, you can try to add another lookahead to your pattern.

^(?:(?!\/\/).)*while.+[\s=(]each[\s(](?!(?:(?!\/\*)[\S\s])*?\*\/)

This needs the m (multiline) flag turned on and the s (single-line) flag turned off.

Another demo at regex101
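
As a rough check of the lookahead variant, here is a small Python sketch with `re.MULTILINE` set and `re.DOTALL` left off, mirroring the flags described above. The sample text and the helper name `matching_lines` are assumptions for illustration, and Python's engine may differ in details from the one in your tool.

```python
import re

# Lookahead variant: a match is rejected if a "*/" follows it before any "/*",
# which is taken as a sign that the match sits inside a block comment.
LOOKAHEAD = re.compile(
    r"^(?:(?!\/\/).)*while.+[\s=(]each[\s(](?!(?:(?!\/\*)[\S\s])*?\*\/)",
    re.MULTILINE,  # ^ matches at each line start; DOTALL stays off
)

# Hypothetical sample input, invented for this illustration.
SAMPLE = "\n".join([
    "while (list($k, $v) = each($_GET)) {}",
    "// while (list($k, $v) = each($_GET))",
    "/*",
    "  while (list($k, $v) = each($_POST))",
    "*/",
    "$x = 1; while (list($a, $b) = each($arr)) {}",
])


def matching_lines(pattern, text):
    """Return the 1-based line numbers on which a match starts."""
    return [text.count("\n", 0, m.start()) + 1 for m in pattern.finditer(text)]


print(matching_lines(LOOKAHEAD, SAMPLE))  # [1, 6]
```

Because the lookahead has to see the closing `*/` on a later line, this presumably only works when the tool applies the regex to the whole file rather than to one line at a time.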


Or, without any flags, using (?<![^\n]) in place of ^ and [^\r\n] instead of \N for compatibility:

(?<![^\n])(?:(?!\/\/)[^\r\n])*?while[^\r\n]+[\s=(]each[\s(](?!(?:(?!\/\*)[\S\s])*?\*\/)

One more demo at regex101
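
The flagless variant can be tried the same way. This Python sketch compiles it with no flags at all and feeds it a made-up sample with Windows (CRLF) line endings; the sample and the helper name `matching_lines` are assumptions for illustration.

```python
import re

# Flagless variant: (?<![^\n]) stands in for ^ and [^\r\n] stands in for "."
FLAGLESS = re.compile(
    r"(?<![^\n])(?:(?!\/\/)[^\r\n])*?while[^\r\n]+[\s=(]each[\s(]"
    r"(?!(?:(?!\/\*)[\S\s])*?\*\/)"
)

# Hypothetical sample input with CRLF line endings.
SAMPLE_CRLF = "\r\n".join([
    "while (list($k, $v) = each($_GET)) {}",
    "// while (list($k, $v) = each($_GET))",
    "/*",
    "  while (list($k, $v) = each($_POST))",
    "*/",
    "$x = 1; while (list($a, $b) = each($arr)) {}",
])


def matching_lines(pattern, text):
    """Return the 1-based line numbers on which a match starts."""
    return [text.count("\n", 0, m.start()) + 1 for m in pattern.finditer(text)]


print(matching_lines(FLAGLESS, SAMPLE_CRLF))  # [1, 6]
```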

bobble bubble
  • Thanks. The tool doesn't seem to support that. Any other way? – Nikita 웃 Jul 29 '18 at 13:24
  • Maybe the inline modifier, can you use [updated regex like this](https://regex101.com/r/YPOJcH/2)? – bobble bubble Jul 29 '18 at 13:26
  • When I am trying the exact regex you put in your demos, it says "RegEx pattern is invalid" unfortunately. It does work well in other engines tho, but not this tool. Probably doesn't accept SKIP, F. Any alternatives? – Nikita 웃 Jul 29 '18 at 13:28
  • Thanks a lot! The search does run this time, no error, but still brings up results that are inside commented blocks. What can be the reason? (I tried the same files content with your regex in regex101 and it doesn't match those blocks, but the tool still does) – Nikita 웃 Jul 29 '18 at 13:48
  • @CM웃 Hmm, no idea (: Experiment with the lookahead a bit. – bobble bubble Jul 29 '18 at 13:55
  • Thanks again. Definitely deserve my upvote, at the very least ;) (would have upvoted once more for your cool BB nickname lol ;) – Nikita 웃 Jul 29 '18 at 14:02
  • Well, it seems like FileSeek doesn't support `m` multiline, but this tool https://sourceforge.net/projects/grepwin/ does and seems to work beautifully with your regex, so thanks again! – Nikita 웃 Jul 29 '18 at 14:13
  • @CM웃 Welcome. I thought of `m` because your pattern uses `^` for line start. FileSeek seems to use the `C#` regex flavor. Without any flags, the last idea I could think of would be [`(?<![^\n])(?:(?!\/\/)[^\r\n])*while[^\r\n]+[\s=(]each[\s(](?!(?:(?!\/\*)[\S\s])*?\*\/)`](https://regex101.com/r/YPOJcH/5). Great you got it going however :) and happy you like my nickname of course (: – bobble bubble Jul 29 '18 at 14:18