I need some regular expressions for "contain" and "do not contain". Normally I'd just write:

Contains : ((.*WORD_A.*))$ and Does Not Contain : (^((?!WORD_A).)*)$

This works fine if used alone, but I want to write something that can detect something like "Contains word A and word B" (order not relevant!) and "Contains word A, but not word B".

Basically, I want the user to be able to make a statement like "Starts with word A, contains word B but not C, and/or ends with D" and have the program return true/false. Ideally, I could just append the regular expressions. Is this possible? I can't figure it out.

Florian Baierl
  • I believe you would need to just loop over the list of words, and not try to do each one all at once. – Jonathon Sep 05 '13 at 13:06
  • The second answer about the use of lookaheads might help you: http://stackoverflow.com/questions/469913/regular-expressions-is-there-an-and-operator – mpellegr Sep 05 '13 at 13:09
  • Thanks. Looping over the list is not an option, since it's huge and I need to watch the performance, but thanks anyway. – Florian Baierl Sep 05 '13 at 13:17

1 Answer

For your example, I'd use lookahead assertions like this:

^WORD_A(?=.*WORD_B)(?!.*WORD_C).*WORD_D$
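To see how this behaves, here is a small check in Python. The question does not name a language, so Python and the helper name `matches` are assumptions; WORD_A to WORD_D are used as literal placeholder strings.

```python
import re

# The pattern from above, unchanged. WORD_A..WORD_D are placeholders
# for the real words the user would supply.
PATTERN = re.compile(r"^WORD_A(?=.*WORD_B)(?!.*WORD_C).*WORD_D$")


def matches(text: str) -> bool:
    """Return True if text starts with WORD_A, contains WORD_B,
    does not contain WORD_C, and ends with WORD_D."""
    return PATTERN.search(text) is not None


if __name__ == "__main__":
    for sample in (
        "WORD_A x WORD_B y WORD_D",       # all four conditions hold
        "WORD_A x WORD_D",                # WORD_B is missing
        "WORD_A WORD_B WORD_C WORD_D",    # forbidden WORD_C is present
        "x WORD_A WORD_B WORD_D",         # does not start with WORD_A
    ):
        print(repr(sample), matches(sample))
```

Note that `.` does not match a newline by default, so this sketch assumes single-line input.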

You can always add more conditions if you want (just add another lookahead). For example, if you want to match any string that contains WORD_A and WORD_B and does not contain WORD_C nor WORD_D:

^(?=.*WORD_A)(?=.*WORD_B)(?!.*WORD_C)(?!.*WORD_D)
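Because each condition is its own lookahead, the pattern can be assembled by simple string concatenation. A minimal sketch in Python, assuming the words are plain text rather than regex fragments (the function name `build_pattern` and its parameters are hypothetical):

```python
import re
from typing import Iterable


def build_pattern(must_contain: Iterable[str],
                  must_not_contain: Iterable[str] = ()) -> re.Pattern:
    """Build an anchored regex from 'contains' and 'does not contain' word lists."""
    parts = ["^"]  # anchor so every lookahead is evaluated from the start of the string
    # One positive lookahead per required word; re.escape treats the word literally.
    parts += [f"(?=.*{re.escape(word)})" for word in must_contain]
    # One negative lookahead per forbidden word.
    parts += [f"(?!.*{re.escape(word)})" for word in must_not_contain]
    return re.compile("".join(parts))


if __name__ == "__main__":
    pattern = build_pattern(["an"], ["This"])
    print(pattern.pattern)
    print(pattern.search("This is an Info") is not None)
    print(pattern.search("It is an Info") is not None)
```

The leading `^` matters: without it, the engine is free to retry the lookaheads at a later position in the string, where the text before that position is no longer seen by the negative lookahead.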
Tim Pietzcker
  • This almost works perfectly, but why does: (?=.*an)(?!.*This) match "This is an Info"? Shouldn't the negative look-ahead negate that? *edit: It works with anchors. Thank you very much, I think this works for me. :) – Florian Baierl Sep 05 '13 at 13:10
  • Note that there is the case where two "words" overlap. For example, the string must begin with "then" and end with "end", and you are testing "thend". It begins with "then" and ends with "end", but `^WORD_A.*WORD_D$` would fail. – Jonathon Sep 05 '13 at 13:13
  • @JonathonWisnoski: In that case, you can do everything with lookaheads (as in my more general, second example). The first example was meant for the particular use case outlined by Florian in his question. – Tim Pietzcker Sep 05 '13 at 13:18