
I still can't get negative lookbehind to work. I want to match every line that contains a certain string, but only if a certain other string does not appear anywhere before it. Specifically, I want all lines containing "_view" that have no "ora" anywhere before it: "blahblahorablahblah_view" should not match, but "blahblah_view" should. I've tried combining `(?<!ora)` with `_view`, but that hits on any line where the text immediately before "_view" is anything other than "ora". `[^(ora)]` doesn't get me what I want either.
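
To make the symptom concrete, here is a minimal sketch using Python's `re` module (an assumption for illustration only; it is not the engine PhpStorm uses, and the third sample line is made up). It suggests why both attempts behave this way: `(?<!ora)` inspects only the three characters directly before `_view`, and `[^(ora)]` is a character class that matches a single character.

```python
import re

lines = ["blahblahorablahblah_view", "blahblah_view", "blahora_view"]

# (?<!ora) looks only at the 3 characters directly before "_view".
lookbehind = re.compile(r"(?<!ora)_view")
lookbehind_hits = [s for s in lines if lookbehind.search(s)]

# [^(ora)] is a negated character class: exactly one character that is
# not "(", "o", "r", "a" or ")".
char_class = re.compile(r"[^(ora)]_view")
char_class_hits = [s for s in lines if char_class.search(s)]

print(lookbehind_hits)
print(char_class_hits)
```

In both cases only the line where "ora" sits directly against "_view" is rejected, which is not the "anywhere before" condition described here.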

I also tried learning from "Perl: Matching string not containing PATTERN", but that didn't get me anywhere. (It doesn't seem to mix positive and negative matches the way I want.)

I'm also testing on https://regex101.com, on the understanding that it is a robust, general tool for diagnosing regexes.

I'm not using Perl or Java but an IDE (PhpStorm), so whatever applies to grep should be good enough.

Opux
  • What IDE are you using? Also, why use a lookbehind? Regex101 does not support infinite-width lookbehind, as the site supports the PCRE, JS and Python regex flavors; only a few regex flavors support it. You may use `^(?!.*ora.*_view).*_view`, based on a look*ahead*. – Wiktor Stribiżew Jun 17 '16 at 21:45
  • Could you format your question and add the IDE you use? – Casimir et Hippolyte Jun 17 '16 at 21:47
  • @Jan: this one doesn't ensure that "ora" is before "view"; it can be anywhere in the string. – Casimir et Hippolyte Jun 17 '16 at 21:48
  • @CasimiretHippolyte: Right you are, [**`^(?:(?!ora).)*view.*$`**](https://regex101.com/r/nI6wV3/3) – Jan Jun 17 '16 at 21:53

1 Answer


At least two ways are possible:

The one that uses lookarounds:

^(?!.*ora.*_view).*_view.*

(easy to write but not efficient, because it may cause a lot of backtracking)
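
As a quick check of the lookahead version, here is a small sketch in Python's `re` (an assumption; any engine with lookaheads should behave the same way). The first two sample lines come from the question; the last two are made up for illustration.

```python
import re

# Reject the line if "ora" occurs anywhere before some "_view",
# then require "_view" to be present.
pattern = re.compile(r"^(?!.*ora.*_view).*_view.*")

samples = {
    "blahblahorablahblah_view": False,  # "ora" precedes "_view"
    "blahblah_view": True,
    "blah_view_then_ora": True,         # "ora" only after "_view"
    "no match here": False,             # no "_view" at all
}

results = {s: bool(pattern.match(s)) for s in samples}
print(results)
```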

The one that uses negated character classes:

^[^o_]*(?:o(?!ra)[^o_]*|_(?!view)[^o_]*)*_view.*

or the version with possessive quantifiers (if available):

^[^o_]*+(?:o(?!ra)[^o_]*|_(?!view)[^o_]*)*+_view.*

or the version that emulates possessive quantifiers (if not available):

^(?=([^o_]*))\1(?=((?:o(?!ra)[^o_]*|_(?!view)[^o_]*)*))\2_view.*
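
To make these variants easier to read, here is a sketch in Python's `re` (an assumption; the sample lines other than the two from the question are made up). It spells out the plain version with `re.VERBOSE`, checks that the emulated-possessive version agrees with it, and uses a toy `a+ab` pattern to show what the `(?=(...))\1` trick does: once the lookahead has captured its text, the engine cannot backtrack into it.

```python
import re

# The plain negated-character-class pattern, written out piece by piece.
unrolled = re.compile(r"""
    ^
    [^o_]*                # anything that is neither "o" nor "_"
    (?:
        o(?!ra) [^o_]*    # an "o" that does not start "ora"
      |
        _(?!view) [^o_]*  # a "_" that does not start "_view"
    )*
    _view                 # the first "_view" in the line
    .*
""", re.VERBOSE)

# Same logic, with each greedy part wrapped as (?=(...))\N so that it
# cannot give characters back.
emulated = re.compile(
    r"^(?=([^o_]*))\1(?=((?:o(?!ra)[^o_]*|_(?!view)[^o_]*)*))\2_view.*"
)

samples = ["blahblahorablahblah_view", "blahblah_view",
           "some_other_view", "coral_view"]
unrolled_hits = [s for s in samples if unrolled.match(s)]
emulated_hits = [s for s in samples if emulated.match(s)]

# Toy illustration of the emulation: a+ can give back one "a",
# the lookahead capture cannot.
greedy = re.match(r"a+ab", "aaab")
atomic_like = re.match(r"(?=(a+))\1ab", "aaab")

print(unrolled_hits, emulated_hits, bool(greedy), bool(atomic_like))
```

The true possessive form (`*+`) is left out of the sketch because Python's `re` only accepts it from version 3.11 on.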

Unless your IDE uses the .NET regex engine (which allows variable-length lookbehinds) or at least the Java regex engine (which allows limited variable-length lookbehinds), there's no way to use a lookbehind here.
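
As an illustration of an engine without variable-length lookbehind, here is a sketch using Python's `re` module (chosen only as an example; it says nothing about what PhpStorm itself does). The lookbehind one would naturally want to write, `(?<!ora.*)_view`, is rejected at compile time.

```python
import re

try:
    # "ora" followed by anything, all inside a lookbehind: variable width.
    re.compile(r"(?<!ora.*)_view")
    fixed_width_only = False
except re.error as exc:
    fixed_width_only = True
    print("rejected:", exc)
```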

Casimir et Hippolyte
  • Meh... I'd rather live with the backtracking than comb through that yarn. But that worked great, thanks. Care to explain how you got that? I would never have figured that out. I understand the start-of-line anchor, the grouping, the lookahead and the greedy `.*`, but what I don't get is why `_view` appears in the grouping and then again outside of it. – Opux Jun 20 '16 at 13:43
  • @Opux: "view" appears in the grouping so that a `_` is consumed there only when it is not the start of the substring `_view`; the group therefore stops as soon as `_view` is reached (this allows a greedy quantifier instead of a slower non-greedy one). `(?!...)` is a negative lookahead and means *not followed by*. – Casimir et Hippolyte Jun 20 '16 at 13:48
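
The point about `_(?!view)` can be seen by capturing what the group consumes. This is a sketch in Python's `re` (an assumption); the capturing group, the sample line and the variable names are added purely for illustration. With the lookahead, the greedy loop stops at the first `_view`; without it, the loop runs past every `_view` and only lands on one by backtracking.

```python
import re

line = "my_table_view_x_view"

# With _(?!view): the loop cannot consume a "_" that starts "_view".
with_lookahead = re.compile(
    r"^([^o_]*(?:o(?!ra)[^o_]*|_(?!view)[^o_]*)*)_view"
)
# Without it: the loop swallows "_view" too and has to backtrack.
without_lookahead = re.compile(
    r"^([^o_]*(?:o(?!ra)[^o_]*|_[^o_]*)*)_view"
)

m1 = with_lookahead.match(line)
m2 = without_lookahead.match(line)
prefix_with = m1.group(1) if m1 else None
prefix_without = m2.group(1) if m2 else None

print(prefix_with)     # text consumed before the first "_view"
print(prefix_without)  # text consumed before the last "_view"
```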