I'd like to use a regex to find an exact string, but not if it's part of a comment, as designated by //.

So for example, in the string:

hello apple apples // eat an apple

It should match the first apple but not the second or third.

So, I think the regex would be something like this. It would find the string with word breaks around it, but not if the // is behind it:

(?<!\/\/)\bapple\b

The problem with negative look-behind in this case is that it only inspects the two characters immediately before the word. I'd need it to look farther back, to make sure the comment symbol does not appear anywhere earlier in the string.
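To make the limitation concrete, here is a minimal sketch in Python (assumed here because the comments mention Python; the variable names are illustrative only). It runs the pattern from the question against the sample string, then checks that the standard `re` module refuses a look-behind of variable width.

```python
import re

line = "hello apple apples // eat an apple"

# Fixed-width negative look-behind: only the two characters directly
# before the word are checked, so a "//" further back is never seen.
naive = re.compile(r"(?<!//)\bapple\b")
naive_starts = [m.start() for m in naive.finditer(line)]
print(naive_starts)  # includes the occurrence inside the comment

# The standard-library re module rejects a variable-width look-behind.
try:
    re.compile(r"(?<!//.*)\bapple\b")
    variable_width_ok = True
except re.error:
    variable_width_ok = False
print(variable_width_ok)
```

Both the standalone `apple` and the one after `//` are reported, which is exactly the unwanted behavior described above.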

twasbrillig
  • 1
    `(?<!//.*)\bapple\b` with [Python regex package](http://stackoverflow.com/questions/11640447/regexps-variable-length-lookbehind-assertion-alternatives) could also work. – bobble bubble Jan 19 '16 at 00:48

2 Answers

4

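This pattern captures what you want in the first capture group. The left alternative consumes a comment without capturing anything, so only an `apple` outside a comment ends up in group 1: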

\/\/.*|\b(apple)\b
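As a rough sketch of how this could be used from Python (the function name `starts_outside_comment` is hypothetical, and the code assumes `//` never appears inside a string literal on the line), keep only the matches where group 1 actually participated:

```python
import re

# Left alternative swallows a comment without capturing; right alternative
# captures the word. Only matches with group 1 set lie outside a comment.
PATTERN = re.compile(r"//.*|\b(apple)\b")

def starts_outside_comment(line):
    """Return start offsets of 'apple' that occur before any // comment."""
    return [m.start(1) for m in PATTERN.finditer(line) if m.group(1) is not None]

print(starts_outside_comment("hello apple apples // eat an apple"))
```

The match that covers `// eat an apple` is still produced by `finditer`, but its group 1 is `None`, so it is filtered out.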

alpha bravo
  • 1
    By using `\/\/.*`, are you forcing Python to find any mention of apple in a comment first so that it won't be matched again when looking for `\b(apple)\b`? Because that's a brilliant approach that I would have never thought of. – tblznbits Jan 19 '16 at 00:46
  • 1
    not necessarily find comment first, but find and CAPTURE what you want, find but don't capture what you don't want. – alpha bravo Jan 19 '16 at 00:52
  • I agree. This is a very clever answer, thank you! It even works in the other direction too. `.*\/\/|\b(apple)\b` would get you the strings that ARE present in the commented section. – twasbrillig Jan 19 '16 at 05:06
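
The reversed pattern from the last comment can be sketched the same way in Python (the function name `starts_inside_comment` is hypothetical). Two assumptions to keep in mind: the line contains at most one `//`, and it actually contains one. If there is no `//` at all, the left alternative never matches and every `apple` is captured.

```python
import re

# Left alternative swallows everything up to and including "//";
# right alternative then captures the word in what remains.
INSIDE = re.compile(r".*//|\b(apple)\b")

def starts_inside_comment(line):
    """Return start offsets of 'apple' found after the // marker."""
    return [m.start(1) for m in INSIDE.finditer(line) if m.group(1) is not None]

print(starts_inside_comment("hello apple apples // eat an apple"))
print(starts_inside_comment("apple pie"))  # no comment marker on this line
```

A simple guard such as checking `"//" in line` before calling the function would avoid the second case.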
0

I think you just need to escape the slashes of the comment marker inside the lookbehind assertion:

    (?<!\/\/)\b(apple)\b ## doesn't work, don't use this.

Try it -- regex101.com

Jonathan Carroll
  • Thanks, but I don't think this works. Try putting a `g` in the modifier field, and you'll see that it detects the last `apple` in the string. https://regex101.com/r/rG7aH9/1 ...but, I did update the question to escape the slashes. – twasbrillig Jan 19 '16 at 01:05
  • Well, you didn't say you wanted it to find more than one. @alpha bravo has the right solution either way. – Jonathan Carroll Jan 19 '16 at 01:56
  • No, that's my point, it's not supposed to match the last `apple` in the string, but does match it. "It should match the first apple but not the second or third." – twasbrillig Jan 19 '16 at 03:57
  • You're right, just `\b(apple)\b` finds only the first when not using the `g` modifier, and matches the comment when the first isn't present. Ignore my answer entirely, it's wrong. – Jonathan Carroll Jan 19 '16 at 04:01