
I would like to use a (pure) regex to match strings that do not contain the word `word`.

However, I would not like to use lookaround, balancing groups, or anything of that kind.

If that is impossible, can we instead match strings that do not start with `word`?


Examples

  • `word` should not match.
  • `wor` should match.
  • `wore` should match.
Kenny Lau
  • Try something like wor[^d]? – SMA May 08 '16 at 15:37
  • Lookaround was added because plain regexes can't handle what you're trying to do. – Jonathan Leffler May 08 '16 at 15:37
  • Then just match strings that do not **start** with `word`, as stated in the question. – Kenny Lau May 08 '16 at 15:37
  • I would also appreciate it if you could prove that (pure) regex cannot do that. – Kenny Lau May 08 '16 at 15:38
  • @SMA: but it should match 'this' and 'that' too — they don't contain 'word'. – Jonathan Leffler May 08 '16 at 15:38
  • Let's start with what you mean by 'pure regex'. Which dialect do you consider to be pure enough to use? POSIX BRE? ERE? PCRE except for lookarounds? – Jonathan Leffler May 08 '16 at 15:39
  • You know, `^a+a*a?(?:ab|cd)$` – Kenny Lau May 08 '16 at 15:40
  • With the non-matching group there, your example looks like PCRE, which *does* support lookarounds. – tripleee May 08 '16 at 16:32
  • You might find useful information at http://www.regular-expressions.info/lookaround.html which says, in part: _Negative lookahead is indispensable if you want to match something not followed by something else. When explaining character classes, this tutorial explained why you cannot use a negated character class to match a 'q' not followed by a 'u'. Negative lookahead provides the solution: `q(?!u)`._ (And 'character classes' was a link to a tutorial.) – Jonathan Leffler May 08 '16 at 16:55
  • Or without the non-capturing group, pretty close to pre-POSIX `egrep`, i.e. proto-ERE. – tripleee May 08 '16 at 17:04
  • As @tripleee mentioned (and deleted his post), you can do the permutations `\b(?:[^w]|wor(?:$|[^d])|wo(?:$|[^r])|w(?:$|[^o]))+`. However, it still needs some anchor or _boundary_ to stop it from matching w`ord`, if you get my drift. –  May 09 '16 at 01:11
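
Regular languages are closed under complement, so a lookaround-free pattern for "does not contain `word`" exists in principle. The following is a minimal sketch and is not taken from the thread: it writes out by hand the automaton that tracks how much of `word` has just been seen. The names `NO_WORD` and `lacks_word` are hypothetical, and Python's `re` is used only for illustration.

```python
import re

# Ways to stay "just after a w": another w, or o / or followed by a fresh w.
_AFTER_W = r"(?:w|ow|orw)*"

NO_WORD = re.compile(
    # Repeated chunks that end with no partial match of "word" pending.
    r"(?:[^w]|w" + _AFTER_W + r"(?:[^wo]|o[^wr]|or[^wd]))*"
    # Optional tail that ends inside a partial match: w, wo or wor.
    + r"(?:w" + _AFTER_W + r"(?:o|or)?)?"
)


def lacks_word(s):
    """Return True if s does not contain the substring 'word'."""
    return NO_WORD.fullmatch(s) is not None


for sample in ("word", "wor", "wore", "swords"):
    print(sample, lacks_word(sample))
```

The pattern uses only character classes, alternation, grouping and repetition; the non-capturing groups are a convenience and could be written as plain groups in dialects that lack `(?:...)`.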

1 Answer


Use this pattern:

^.*\bword\b.*$|(.+)

The first alternative matches (and discards) strings that contain `word`; the second matches and captures the strings that don't, so the wanted strings end up in group 1.
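
A minimal sketch of how the captured group might be used, assuming Python's `re` module with `re.MULTILINE` so that `^` and `$` apply per line. The helper name `lines_without_word` is hypothetical and not part of the answer.

```python
import re

# First branch: consume any line containing the whole word "word" (no capture).
# Second branch: capture every other non-empty line in group 1.
PATTERN = re.compile(r"^.*\bword\b.*$|(.+)", re.MULTILINE)


def lines_without_word(text):
    """Return the lines captured in group 1, i.e. those the first branch skipped."""
    return [m.group(1) for m in PATTERN.finditer(text) if m.group(1) is not None]


print(lines_without_word("word\nwor\nwore\na word here\nswords"))
```

Because of the `\b` anchors, a line such as `swords` is still captured; dropping them would exclude every line that contains `word` as a substring.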


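Depending on your engine (`(*SKIP)(*F)` are PCRE backtracking control verbs), you could use this pattern, which makes the unwanted branch fail so that only the wanted strings match: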

^.*\bword\b.*$(*SKIP)(*F)|(.+)  


alpha bravo