
I'm learning the regex lookaround feature (lookbehind and lookahead), but I cannot use the .* or .+ quantifiers in a lookbehind (although I can in a lookahead).

The regex I'm trying to fix is the following:

(?<!yellow.*)blue(?=.*brown)

The idea is to match lines that don't contain yellow but do contain blue, and only if brown appears after blue. Here are some samples:

yellow blue brown                    // shouldn't match
f blue brown                         // should match
sdff blue brown                      // should match
asdf  f blue c                       // shouldn't match
yellow blue fblue b f brown          // shouldn't match

Here is my test:

http://regex101.com/r/fY4kI9/5

The error I get is:

. * Lookbehinds need to be zero-width, thus quantifiers are not allowed

Do you know how I can fix that?

Federico Piazza
  • Which regex engine? Many (most?) regex engines don't support variable-length lookbehind. You can cheat some of them by using `{0,100}` or the like. – Boris the Spider Jul 01 '14 at 21:08
  • You may need a compound regular expression... – Mr. Polywhirl Jul 01 '14 at 21:08
  • What language/tool are you using here? – anubhava Jul 01 '14 at 21:09
  • @anubhava I only want the regex, not tied to any particular language. – Federico Piazza Jul 01 '14 at 21:12
  • 2
    @Fede that makes absolutely no sense. Regex is implemented by engines, and they have different capabilities. Saying you want the regex only is nonsensical. Even the link you provide to the tester has a "flavours" selector down the side that changes the engine being simulated. – Boris the Spider Jul 01 '14 at 21:14
  • 1
    @Fede: Different languages have different implementations; some support infinite-length lookbehinds, but some don't. If the regex flavor you're using *does* support it, the potential answers might change, too. That's why we're asking what platform you're using the regex on. – Amal Murali Jul 01 '14 at 21:14
  • @AmalMurali good point, sorry for that. The idea is to implement it in Java. I'll add the tag. – Federico Piazza Jul 01 '14 at 21:19
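
As an illustration of the `{0,100}` workaround mentioned in the comments, here is a minimal Java sketch. It assumes that bounding the lookbehind is acceptable: the bound of 100 and the names `BoundedLookbehindDemo` and `containsMatch` are made up for this example and are not from the thread. Note that this only rejects a yellow that appears before blue (and within the bound), so a line such as `sdff blue brown yellow` would still be accepted.

```java
import java.util.regex.Pattern;

public class BoundedLookbehindDemo {
    // Assumption: "yellow" is only detected when at most 100 characters
    // separate it from "blue"; the bound is arbitrary and chosen for illustration.
    static final Pattern BOUNDED =
            Pattern.compile("(?<!yellow.{0,100})blue(?=.*brown)");

    // True if some "blue" on the line has no earlier "yellow" (within the bound)
    // and is followed somewhere by "brown".
    static boolean containsMatch(String line) {
        return BOUNDED.matcher(line).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "yellow blue brown",
            "f blue brown",
            "sdff blue brown",
            "asdf  f blue c",
            "yellow blue fblue b f brown"
        };
        for (String s : samples) {
            System.out.println(containsMatch(s) + "  " + s);
        }
    }
}
```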

1 Answer


You can use this regex, which avoids variable-length lookbehind but still gives the same functionality:

.*yellow.*(*SKIP)(*F)|^.*\bblue\b(?=.*brown).*$

Working Demo

anubhava
  • Nice. But wouldn't this still match `sdff blue brown yellow`? – Amal Murali Jul 01 '14 at 21:27
  • But that should match right? OP just doesn't want `yellow` before `blue` I think. Or did I misread it? – anubhava Jul 01 '14 at 21:28
  • Also, this only matches ``blue`` but I think OP wants to match the whole line – badger5000 Jul 01 '14 at 21:31
  • "*The idea is to match lines that don't have yellow*": as I understand it, the OP wants to match lines that don't contain `yellow`, and where `blue` is followed by `brown`. – Amal Murali Jul 01 '14 at 21:31
  • Oh, that's easy: just add `.*` at the end, i.e. `yellow.*(*SKIP)(*F)|\bblue\b(?=.*brown).*` – anubhava Jul 01 '14 at 21:32
  • @anubhava needs one more edit to match the start of the line: ``.*yellow.*(*SKIP)(*F)|.*\bblue\b(?=.*brown).*`` (else in ``purple blue brown`` it just matches ``blue brown``) – badger5000 Jul 01 '14 at 21:37
  • @anubhava thanks for the answer. Can you explain a little more about how that regex is composed? – Federico Piazza Jul 01 '14 at 21:55
  • `(*F)` is shorthand for `(*FAIL)`, which behaves like a failing negative assertion and is a synonym for `(?!)`. `(*SKIP)` defines a point beyond which the regex engine is not allowed to backtrack when the subpattern fails later. Together, `(*SKIP)(*FAIL)` provide a nice way around the restriction that you cannot have a variable-length lookbehind in the regex above. – anubhava Jul 01 '14 at 21:58
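
Since the asker mentions Java: `(*SKIP)` and `(*F)` are PCRE backtracking verbs, and `java.util.regex` does not support them. The sketch below shows one possible way to get the same line-level result in Java by swapping in a negative lookahead anchored at the start of the line. This is a substitute technique, not the pattern from the answer, and the names `BlueBrownFilter` and `matchesLine` are hypothetical. It follows the reading that yellow must not appear anywhere on the line.

```java
import java.util.regex.Pattern;

public class BlueBrownFilter {
    // Assumed rewrite, not the pattern from the answer:
    //   ^(?!.*yellow)  from the line start, reject if "yellow" occurs anywhere
    //   .*\bblue\b     the line must contain the word "blue"
    //   (?=.*brown)    "brown" must occur somewhere after that "blue"
    //   .*$            consume the rest so the whole line is the match
    static final Pattern LINE =
            Pattern.compile("^(?!.*yellow).*\\bblue\\b(?=.*brown).*$");

    // True if the whole line satisfies the pattern above.
    static boolean matchesLine(String line) {
        return LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "yellow blue brown",
            "f blue brown",
            "sdff blue brown",
            "asdf  f blue c",
            "yellow blue fblue b f brown",
            "sdff blue brown yellow"
        };
        for (String s : samples) {
            System.out.println(matchesLine(s) + "  " + s);
        }
    }
}
```

Because the lookahead is evaluated from the start of the line, it plays the role the variable-length lookbehind was meant to play, without needing one.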