I am trying to build a regular expression but am finding it difficult to do so for this one particular case. I want to return a match only if the string I am trying to parse does not contain a specific substring. For example (searching for the substring "test" with case insensitivity):

"Line one
Line two
Line test three" - Return false.

"Line test one" - Return false.

"Tfest" - Return true.

"Tfest
Tdhf
Line three" - Return true.

I've been able to do this for single line strings using ^((?!message 1).)*$ but I'm not sure about multi-line strings.

PS: I don't want to start a debate on using string operations VS regular expressions. Performance is a concern. The constraint of the question is that the solution must use regular expressions.

Alexandru
  • `\btest\b` would work, if `IsMatch` then you can just run your code for it contains substring. – abc123 Aug 28 '13 at 15:33
  • @abc123 It does not seem to work for @"Line onetddest (NEWLINE) messadge 1dwd" – Alexandru Aug 28 '13 at 15:34
  • Try using the flag [`RegexOptions.Singleline`](http://msdn.microsoft.com/en-us/library/443e8hc7(vs.71).aspx). – Jerry Aug 28 '13 at 15:35
  • Thanks Jerry, this worked! Post your answer as the solution. – Alexandru Aug 28 '13 at 15:39
  • "_Performance is a concern._" And you're under the impression regex is _fast_? – Grant Thomas Aug 28 '13 at 15:47
  • @GrantThomas http://stackoverflow.com/questions/12428776/why-are-c-sharp-compiled-regular-expressions-faster-than-equivalent-string-metho But again, I won't entertain this debate. Please use other threads to debate this clause. – Alexandru Aug 28 '13 at 15:59
  • Applying that link of a specific type problem doesn't necessarily generally apply; I'm in no mind to debate this, to be fair, it was an honest question. If you thought this specific to your case though, you might have added it to the question in the clause. – Grant Thomas Aug 28 '13 at 16:02
  • Instead of using `if (IsMatch(Str, negativePattern))` logic, it would be much faster to use `if (!IsMatch(Str, positivePattern))`. – ridgerunner Aug 28 '13 at 16:14

2 Answers

The problem is that the dot doesn't match newlines.

As Jerry suggests, you can use singleline mode (also called dotall mode) to allow the dot to match newlines.
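To illustrate the difference, here is a minimal sketch using one of the multi-line samples from the question. The variable names are only illustrative, and "test" stands in for the substring to exclude:

```csharp
using System;
using System.Text.RegularExpressions;

// Multi-line sample from the question; it does not contain "test".
string input = "Tfest\nTdhf\nLine three";
string pattern = @"^((?!test).)*$";

// Default options: '.' stops at '\n', and '$' only matches at the end of the
// string (or before a final newline), so the pattern cannot span all lines.
bool withoutSingleline = Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase);

// Singleline: '.' also matches '\n', so the whole string can be consumed.
bool withSingleline = Regex.IsMatch(input, pattern,
    RegexOptions.IgnoreCase | RegexOptions.Singleline);

Console.WriteLine(withoutSingleline); // False
Console.WriteLine(withSingleline);    // True
```

Without the flag, the match fails even though the substring is absent, which is why the single-line pattern appears to break on multi-line input.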

Another way is to avoid the dot altogether. For example, for the word "test":

^(?>[^t]+|\Bt|t(?!est\b))*$

Note that this approach should be faster, since the lookahead is tested only when a "t" is preceded by a word boundary, whereas ^((?!\btest\b).)*$ tests it at every character.
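As a rough check, the dot-free pattern can be run against the four samples from the question. This is only a sketch: the names `noTestWord`, `samples` and `results` are made up for the example, and `RegexOptions.IgnoreCase` is assumed because the question asks for case insensitivity:

```csharp
using System;
using System.Text.RegularExpressions;

// Dot-free pattern from above. No Singleline flag is needed, because the
// negated class [^t] already matches newlines.
var noTestWord = new Regex(@"^(?>[^t]+|\Bt|t(?!est\b))*$", RegexOptions.IgnoreCase);

string[] samples =
{
    "Line one\nLine two\nLine test three", // contains the word "test"
    "Line test one",                       // contains the word "test"
    "Tfest",                               // no "test"
    "Tfest\nTdhf\nLine three",             // no "test"
};

bool[] results = new bool[samples.Length];
for (int i = 0; i < samples.Length; i++)
{
    results[i] = noTestWord.IsMatch(samples[i]);
    Console.WriteLine(results[i]); // False, False, True, True
}
```

Because of the `\b` and `\B` assertions, this pattern only rejects "test" as a whole word; a string such as "testing" would still match. If any occurrence of the substring should be rejected, a variant without the word-boundary parts, such as `^(?>[^t]+|t(?!est))*$`, may be closer to what the question asks for.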

Casimir et Hippolyte
  • I like your search method a lot. This makes a lot of sense. Thanks for taking the time to add this useful information in! – Alexandru Aug 28 '13 at 16:56

You can use the flag RegexOptions.Singleline to make the dot in your regex match newlines:

new Regex(@"^((?!message 1).)*$", RegexOptions.Singleline | RegexOptions.IgnoreCase);
Jerry