I'm playing Regex Golf (http://regex.alf.nu/) and I'm doing the Abba hole. I have the following regex that matches the wrong side entirely (which is what I was trying to do):

(([\w])([\w])\3\2)

However, I'm trying to negate it now so it matches the other side. I can't seem to figure that part out. I tried:

(?!([\w])([\w])\3\2)

But that didn't work. Any tips from the regex masters?

Lester Peabody

3 Answers

You can make it much shorter (and get more points) by simply using . and removing unnecessary parens:

^(?!.*(.)(.)\2\1)

It just makes sure that there is no "abba" anywhere in the string ("abba" here means any four characters in that mirrored order, which is what we don't want to match), without having to match the whole word.
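To see the anchored pattern in action, here is a minimal Python sketch. The sample words and the helper name `has_no_abba` are my own illustrations (they are not taken from the Regex Golf lists), and Python's `re` module is only standing in for whatever engine the site uses.

```python
import re

# Pattern from the answer: the lookahead scans the whole string from the
# start and rejects it if any "xyyx"-shaped run of four characters exists.
NO_ABBA = re.compile(r"^(?!.*(.)(.)\2\1)")

def has_no_abba(word: str) -> bool:
    """Hypothetical helper: True when the word contains no abba-shaped run."""
    return NO_ABBA.search(word) is not None

# Illustrative words chosen for this sketch.
for word in ["abba", "noon", "anallagmatic", "regex", "golf"]:
    print(word, has_no_abba(word))
```

Only "regex" and "golf" should come back as `True`; the other three each contain a mirrored run ("abba", "noon", "alla").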

Jerry
  • I know this is old, but could you explain how `?!` works please? More specifically, why `(?!(.)(.)\2\1)` matches everything. – Adi Bradfield May 20 '15 at 08:32
  • @AdiBradfield `(?!a)a` will never match anything, because the `(?! ... )` group is followed by `a`, and `(?!a)` prevents a match whenever what comes next is `a` (what's inside the group). Similarly, `(?!a)b` will always match a `b`: `(?!a)` prevents a match only if it is followed by `a`, and that can never happen here because there is a `b` after it. By extension, `^(?!.*a)` will prevent a match on any line that contains `a`. The anchor and the `.*` are important because otherwise the pattern will simply start to match after any `a` that might exist (because after that point, there are no more `a` to prevent the match). – Jerry May 20 '15 at 09:12
  • Okay, yeah that makes perfect sense. Thanks for the clarification – Adi Bradfield May 20 '15 at 09:25
  • @Zikato It is working; the regex simply matches an empty line, not that it matters if you are only checking for a match or not, see http://i.stack.imgur.com/5MzvR.png – Jerry May 26 '16 at 06:54
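The three lookahead cases Jerry walks through in the comments above can be checked directly. This is a rough Python sketch; the helper name `matches` and the test strings "banana" and "kiwi" are assumptions made up for the illustration.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the pattern matches anywhere in text."""
    return re.search(pattern, text) is not None

# (?!a)a : the lookahead forbids an "a" exactly where an "a" is required.
never = matches(r"(?!a)a", "banana")

# (?!a)b : wherever a "b" sits, the next character is not "a", so it matches.
always = matches(r"(?!a)b", "banana")

# ^(?!.*a) : anchored at the start, the lookahead scans the whole line for "a".
line_with_a = matches(r"^(?!.*a)", "banana")
line_without_a = matches(r"^(?!.*a)", "kiwi")

print(never, always, line_with_a, line_without_a)
```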
Using the explanation here: https://stackoverflow.com/a/406408/584663

I came up with: ^((?!((\w)(\w)\4\3)).)*$
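As a quick sanity check, the sketch below compares this pattern with the shorter `^(?!.*(.)(.)\2\1)` from the other answer. The word list is made up for the illustration. One caveat: this version uses `\w` inside the lookahead, so the two can disagree on input containing non-word characters; on plain words they should agree.

```python
import re

# Shorter pattern from the other answer.
SHORT = re.compile(r"^(?!.*(.)(.)\2\1)")

# Pattern from this answer: consume one character at a time, checking at
# every position that an abba-shaped run of word characters does not start.
TEMPERED = re.compile(r"^((?!((\w)(\w)\4\3)).)*$")

# Illustrative words chosen for this sketch.
words = ["abba", "xabba", "noon", "regex", "golf"]

# Map each word to (short pattern matched?, tempered pattern matched?).
results = {
    w: (SHORT.search(w) is not None, TEMPERED.search(w) is not None)
    for w in words
}

for w, pair in results.items():
    print(w, pair)
```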

Bill
The key here turns out to be the leading caret, ^, together with the .* inside the look-ahead.

(?! ...) is a look-ahead construct, and so does not advance the regex processing engine.

/(?! ...)/ on its own will correctly fail at any position where the text ahead matches the expression within; but at any position where the text ahead does not match (...), the look-ahead passes and the regex engine continues processing. However, if your regex only contains the (?! ), there is nothing left to process, and because a look-ahead consumes no characters the match position never advances. (See this great answer).

Since the remaining regex is empty, it matches a zero-width segment at the first position where the look-ahead passes. Every string has such a position (the very end of the string, if nowhere earlier), i.e. it matches any string.

[begin SWAG]

With the caret ^ present, the regex engine is able to recognize that you are looking for a real answer and that you do not want it to tell you the string contains zero-width components.

[end SWAG]

Thus it is able to correctly fail to match when the expression inside the (?! ) is found.
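A small Python sketch makes the zero-width behaviour visible by printing where each variant matches. The variable names and the test strings "abba" and "xabba" are assumptions chosen for the illustration.

```python
import re

# No anchor: the engine is free to retry at later positions.
# On "abba" the lookahead fails at index 0 but passes at index 1,
# so the result is an empty match there.
unanchored_span = re.search(r"(?!(.)(.)\2\1)", "abba").span()

# Anchor but no .* : only index 0 is inspected, so an "abba" that
# starts later in the string goes unnoticed.
anchor_only_span = re.search(r"^(?!(.)(.)\2\1)", "xabba").span()

# Anchor plus .* : the lookahead scans the whole string from index 0,
# and no other starting position is allowed, so the match fails.
anchored_full = re.search(r"^(?!.*(.)(.)\2\1)", "xabba")

print(unanchored_span, anchor_only_span, anchored_full)
```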

inquist
  • To make your SWAG more precise: the caret does not allow the engine to find zero-length matches that are not at the beginning of the string. – RHH Aug 02 '14 at 02:35