I have used regular expressions a little, but I've now run into a situation I don't know how to handle, and none of the online tutorials I read cover it. The problem sounds easy, and it probably is, but I haven't found a solution yet.

I need to parse a string that is composed of short strings combined with the boolean operators AND, OR and NOT. Brackets are allowed, whitespace shouldn't matter, and matching should be case-insensitive.

For instance

xxxx AND  (yyyy or zzzz) and not  qqqq

Right now I'd like to find all the strings that have a NOT in front of them. For that I created this regex pattern:

"NOT\s+[\w]+"

(The literal NOT, followed by at least one whitespace character, followed by at least one word character, i.e. a letter, digit or underscore.)

This would give me "not qqqq" in the above example.

But, if the user happens to write

xxxx AND  (yyyy or zzzz) and not not  qqqq

which is a valid string, then my pattern would give me "not not" instead of "not qqqq".
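
The behaviour described above can be reproduced with a short script. This is only a sketch: it uses Python's `re` module for illustration rather than .NET, on the assumption that this simple pattern behaves the same way in both engines for these inputs. The variable names are made up for the example.

```python
import re

# The original pattern from the question, matched case-insensitively.
pattern = re.compile(r"NOT\s+[\w]+", re.IGNORECASE)

single = "xxxx AND  (yyyy or zzzz) and not  qqqq"
double = "xxxx AND  (yyyy or zzzz) and not not  qqqq"

# findall returns the full text of every non-overlapping match.
single_matches = pattern.findall(single)
double_matches = pattern.findall(double)

print(single_matches)  # ['not  qqqq']
print(double_matches)  # ['not not']
```

In the second input the first "not" consumes the second "not" as its operand, so "qqqq" is never reached.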

So I need a regex pattern that matches "the literal word NOT, followed by whitespace, followed by any string except the word NOT, followed by whitespace (or the end of the string)".

I know how to negate single characters, but I haven't found any working examples that negate whole words, at least none that work when I test them on regexstorm.net.

I use .NET.

Nostromo
  • I'll say that the correct way to analyze it is to capture the whole *not not qqqq*, otherwise you have a dangling *not* somewhere – xanatos Jan 23 '21 at 09:16
  • And note that you could perhaps have aaa *xxx and not (yyy or zzz)* – xanatos Jan 23 '21 at 09:17
  • Yes, I know. I'd like to capture the terms one by one and replace them with placeholders until I only have simple terms left. So, when the string is "not not qqqq", I'd capture "not qqqq" first and replace it with a placeholder, giving me "not ", after that the next matching would give me "not ". – Nostromo Jan 23 '21 at 09:37

1 Answer

Using a case-insensitive match, you could use:

\bNOT\s+(?!NOT\b)\w+
  • \bNOT\s+ A word boundary, then match NOT and 1+ whitespace chars
  • (?!NOT\b) Negative lookahead, assert that the whole word NOT does not follow directly to the right
  • \w+ Match 1+ word characters
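
As a rough illustration of the pattern above, here is a minimal sketch in Python (not .NET; the lookahead and word-boundary syntax used here is also supported by .NET's regex engine, where the case-insensitive option would be `RegexOptions.IgnoreCase`). The helper `reduce_negations` and the `P0`, `P1`, ... placeholder names are hypothetical, loosely following the placeholder idea from the comments; they assume that no real term in the input already looks like a placeholder.

```python
import re

# The pattern from this answer, matched case-insensitively.
pattern = re.compile(r"\bNOT\s+(?!NOT\b)\w+", re.IGNORECASE)


def innermost_negations(expression):
    """Return every 'NOT term' whose term is not itself the word NOT."""
    return pattern.findall(expression)


def reduce_negations(expression):
    """Replace innermost 'NOT term' matches with placeholders, one at a time.

    The placeholder names (P0, P1, ...) are an assumption made for this sketch.
    """
    placeholders = {}
    while True:
        match = pattern.search(expression)
        if match is None:
            return expression, placeholders
        key = f"P{len(placeholders)}"
        placeholders[key] = match.group(0)
        # Splice the placeholder in where the match was found.
        expression = expression[:match.start()] + key + expression[match.end():]


text = "xxxx AND  (yyyy or zzzz) and not not  qqqq"
found = innermost_negations(text)
reduced, table = reduce_negations(text)

print(found)    # ['not  qqqq']
print(reduced)  # xxxx AND  (yyyy or zzzz) and P1
print(table)    # {'P0': 'not  qqqq', 'P1': 'not P0'}
```

The `\b` inside the lookahead matters: without it, a term that merely starts with "not", such as "notable", would be rejected as well.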

The fourth bird