I have the following Regex:

public static Regex regex = new Regex( @"(?:\s+(?<statement>(?:[\w./]+)?\s*(?:(?:With|Without)\s*(?:[\w./]+))?)\s*(?:$|\s+AND))+(?<remainder>.*)");

For the string " Tom With Jane AND Mike Without Anne AND" I can capture both "Tom With Jane" and "Mike Without Anne" as statements. Now I'd like to capture the last "AND" in the "remainder" group since it is not followed by another statement. How can I do that? Here's the code that I'm using:

using System.Text.RegularExpressions;

class Program {
  public static Regex regex = new Regex( @"(?:\s+(?<statement>(?:[\w./]+)?\s*(?:(?:With|Without)\s*(?:[\w./]+))?)\s*(?:$|\s+AND))+(?<remainder>.*)" );

  static void Main( string[] args ) {
    var s = " Tom With Jane AND Mike Without Anne AND";
    var match = regex.Match( s );
    // Each repetition of the outer group adds one entry to statements.Captures.
    var statements = match.Groups["statement"];
    var remainder = match.Groups["remainder"];
  }
}

Steve R

2 Answers

It may be a bit fancy, but I think .NET balancing groups (MSDN) allow for a clean solution that is easy to extend:

(?<statement>(?<word>\w+)+\s+(With|Without)\s+(?<-word>\w+)+(?(word)(?!)))|(?<statement>\sAND\s)|(?<remainder>\sAND$)

Demo

You can test the pattern online at regexstorm.net/tester
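
As a rough sketch of how the pattern above might be consumed from C#, the snippet below walks over Regex.Matches and sorts each match into statements or the remainder. The class name BalancingGroupDemo and the method Scan are hypothetical names chosen for illustration. The sketch also assumes you want to skip the " AND " separators, which the pattern tags as "statement" as well.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BalancingGroupDemo
{
    // Pattern copied verbatim from the answer.
    private static readonly Regex Pattern = new Regex(
        @"(?<statement>(?<word>\w+)+\s+(With|Without)\s+(?<-word>\w+)+(?(word)(?!)))|(?<statement>\sAND\s)|(?<remainder>\sAND$)");

    // Hypothetical helper: collects real statements and the trailing remainder.
    public static (List<string> Statements, string Remainder) Scan(string input)
    {
        var statements = new List<string>();
        string remainder = "";

        foreach (Match m in Pattern.Matches(input))
        {
            if (m.Groups["remainder"].Success)
            {
                remainder = m.Groups["remainder"].Value.Trim();
            }
            else if (m.Groups["statement"].Success)
            {
                string text = m.Groups["statement"].Value.Trim();
                // Assumption: separators matched by the second alternative are not wanted.
                if (text != "AND")
                {
                    statements.Add(text);
                }
            }
        }

        return (statements, remainder);
    }
}
```

For the input " Tom With Jane AND Mike Without Anne AND", Scan should return the statements "Tom With Jane" and "Mike Without Anne" and the remainder "AND".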

wp78de

Why don't you split the string using \s*AND\s*?

string s = " Tom With Jane AND Mike Without Anne AND";
string[] ss = Regex.Split(s.Trim(), @"\s*AND\s*");

will give you

new string[] { "Tom With Jane", "Mike Without Anne", "" }

If you want to avoid matching names that contain "AND" and still match "AND"s at the end of string, you could add a word-boundary constraint: \s*\bAND\b\s*.
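
If you also need the dangling "AND" reported as a remainder, one possible sketch is to check whether the split leaves an empty last element. The names StatementSplitter and Parse are hypothetical, and treating the remainder as the literal string "AND" is an assumption about what the question wants.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class StatementSplitter
{
    // Hypothetical helper, not part of the original answer: returns the statements
    // plus "AND" as the remainder when the input ends with a dangling AND.
    public static (string[] Statements, string Remainder) Parse(string input)
    {
        // Word boundaries keep names such as "ANDY" from being split.
        string[] parts = Regex.Split(input.Trim(), @"\s*\bAND\b\s*");

        // A trailing separator leaves an empty string as the last element.
        bool danglingAnd = parts.Length > 1 && parts[parts.Length - 1].Length == 0;

        // Drop the empty entries so only real statements remain.
        string[] statements = parts.Where(p => p.Length > 0).ToArray();
        return (statements, danglingAnd ? "AND" : "");
    }
}
```

With the string from the question, Parse should yield "Tom With Jane" and "Mike Without Anne" as statements and "AND" as the remainder. An input without a trailing AND gives an empty remainder.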

sshine