This oughta do it:
string test = @"This|is|a|pip\|ed|test (this is a pip|ed test)";
string[] parts = Regex.Split(test, @"(?<!(?<!\\)*\\)\|");
The regular expression basically says: split on pipes that aren't preceded by an escape character. I shouldn't take any credit for this though, I just hijacked the regular expression from this post and simplified it.
EDIT
In terms of performance, compared to the manual parsing method provided in this thread, I found that this Regex implementation is 3 to 5 times slower than Jonathon Wood's implementation using the longer test string provided by the OP.
With that said, if you don't instantiate or add the words to a List<string>
and return void instead, Jon's method comes in at about 5 times faster than the Regex.Split()
method (0.01ms vs. 0.002ms) for purely splitting up the string. If you add back the overhead of managing and returning a List<string>
, it was about 3.6 times faster (0.01ms vs. 0.00275ms), averaged over a few sets of a million iterations. I did not use the static Regex.Split() for this test, I instead created a new Regex instance with the expression above outside of my test loop and then called its Split method.
UPDATE
Using the static Regex.Split() function is actually a lot faster than reusing an instance of the expression. With this implementation, the use of regex is only about 1.6 times slower than Jon's implementation (0.0043ms vs. 0.00275ms)
The results were the same using the extended regular expression from the post I linked to.