Related: https://stackoverflow.com/a/2910549/194031

I have a string like:

"abc defgh <!inc(C:\my files\abc.txt)!>kdi kdkd<!inc(C:\my files\abc.txt)!>"

and I want to get:

["abc defgh ", "C:\my files\abc.txt", "kdi kdkd", "C:\my files\abc.txt"]

Also, I don't want

"abc <!inc(C:\my files\abc.txt adf" (missing end bracket) 

to get split.

Based on the related question and other similar answers, I think I need lookaheads, but I can't figure out how to use them so that the tags are removed and the string is not split when part of a tag is missing.

Chad

2 Answers

This might help you get started. You'll probably need to tailor it some more.

Regex.Split("...", @"<!inc\((?=.*?\)!>)|(?<=<!inc\(.*?)\)!>");

Expression breakdown

<!inc\(
(?=.*?\)!>)     // positive lookahead: make sure ')!>' appears later in the
                // string before counting this as a match
|
(?<=<!inc\(.*?) // positive lookbehind: make sure '<!inc(' appears earlier
\)!>            // in the string before counting this as a match
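
As a minimal sketch of how the pattern above might be applied to the two strings from the question (the names `IncSplitDemo` and `SplitOnIncTags` are made up for illustration and are not part of the answer): `Regex.Split` keeps an empty string when a match sits at the very start or end of the input, so the sketch filters empty entries out.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class IncSplitDemo
{
    // The pattern from the answer above, unchanged.
    private const string Pattern = @"<!inc\((?=.*?\)!>)|(?<=<!inc\(.*?)\)!>";

    public static string[] SplitOnIncTags(string input)
    {
        // Regex.Split yields an empty string for a match at the start or
        // end of the input; drop those so only real segments remain.
        return Regex.Split(input, Pattern)
                    .Where(part => part.Length > 0)
                    .ToArray();
    }
}
```

Calling `IncSplitDemo.SplitOnIncTags` on the first string from the question should yield the four expected segments, and the string with the missing `)!>` should come back as a single, unsplit element. Note that the lookbehind only checks that some `<!inc(` occurs earlier, so a stray `)!>` after a complete tag would still be split on.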
Mike Park

This is your regex

<!inc\((?=[^)]+\)!>)|(?<=<!inc\([^)]+)\)!>

It splits on (and removes) every <!inc( if and only if it has a matching )!>, and vice versa.
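
A small sketch of this pattern on input that contains line breaks, assuming the text inside the tag holds no `)` character (the names `NegatedClassDemo` and `Split` are hypothetical). Because `[^)]` also matches newline characters, the pattern does not rely on `RegexOptions.Singleline`, and `RegexOptions.Multiline` only changes how `^` and `$` behave, neither of which appears here.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class NegatedClassDemo
{
    // The pattern from this answer, unchanged.
    private const string Pattern = @"<!inc\((?=[^)]+\)!>)|(?<=<!inc\([^)]+)\)!>";

    public static string[] Split(string input)
    {
        // No RegexOptions are passed: [^)] already matches '\n',
        // so line breaks around or inside a tag do not block a match.
        return Regex.Split(input, Pattern);
    }
}
```

For example, `NegatedClassDemo.Split("first line\n<!inc(C:\\my files\\abc.txt)!>\nsecond line")` should return three elements, with the line breaks left attached to the surrounding text. One limitation of the negated class: a path that itself contains `)`, such as one under `Program Files (x86)`, would not be split, because `[^)]+` cannot cross that character.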

Tomalak
  • Thanks Tomalak! This almost works for all cases, except that when I go to multiline input, it no longer splits properly. Singleline, it was fine though. Sorry, I didn't mention multiline earlier, I didn't realize it might affect the answers. I also tried using the RegexOptions.Multiline, but that didn't help. – Chad May 22 '12 at 16:25
  • @Chad Multiline or singleline *should* not affect this regex, at least I can't see how. – Tomalak May 22 '12 at 16:43