I would like to split a string on several delimiter characters. For example, consider the spin text format:

This is a {long|ugly|example} string 

I want to parse this string and split it on the "{", "|", and "}" characters:

myString.Split('|','{','}')

Now I have tokens to play with, but what I would like is to retain the info about which char was used to split each piece of the array that is returned.

Is there any existing code that can do something like this?

Brady Moritz
  • Can you just make an Array of Tokens that you will use when doing the Split..? – MethodMan Nov 14 '12 at 16:57
  • What's your final objective? You're not writing an expression evaluator by chance? – Paul Sasik Nov 14 '12 at 16:57
  • `Split` gives you an array of strings `{"This is a ", "long", "ugly", "example", " string"}`. What do you want to get instead? – Sergey Berezovskiy Nov 14 '12 at 17:04
  • Textbook [XY Problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). **Why** do you want to do this? Your example will give you strings, not tokens. Are you trying to parse JSON? Logic expressions? There are many packages that do that already. What are you going to do with these "tokens" that are not actually tokens? – Dour High Arch Nov 14 '12 at 17:54
  • The example, strictly for example's sake, is "spin text" format. It's a good example of a very simple text format that is overkill for fancier parsers, but still more complex than one would want to just run a split function on. – Brady Moritz Nov 14 '12 at 22:29
  • I added a followup question here- just a split function that leaves the split chars in the resulting array. http://stackoverflow.com/questions/13388648/net-split-with-split-characters-retained – Brady Moritz Nov 14 '12 at 22:42

1 Answer

I would lean toward using regular expressions for this. You could then use capture groups to track which character ended each piece.

Check out this regular expression tester. Use your test data and this regular expression pattern:

([^{|}]+)([{|}]?)

This effectively splits the string into 5 matches. Each match contains 2 capture groups: the split string and the character it was split on (empty for the last piece, which runs to the end of the string).

The code to run this would be something like this:

MatchCollection m = Regex.Matches("This is a {long|ugly|example} string ",@"([^{|}]+)([{|}]?)");

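Now the match collection m will contain a Match object for each matched piece. Each Match in turn contains the 2 capture groups: the string and the character it was split on. Note that Groups[0] is always the entire match, so the capture groups start at index 1:

string piece = m[0].Groups[1].Value;     // 1st split string
string delimiter = m[0].Groups[2].Value; // 1st split character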
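
To walk over every piece rather than just the first, something along these lines should work. This is only a sketch: the class name `SpinTextSplitter` and the method `SplitKeepingDelimiters` are made up for illustration, and the tuple syntax assumes C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SpinTextSplitter
{
    // Hypothetical helper (name is an assumption, not an existing API):
    // returns each piece together with the delimiter that ended it.
    // The delimiter is "" when the piece ran to the end of the input.
    public static List<(string Piece, string Delimiter)> SplitKeepingDelimiters(string input)
    {
        var result = new List<(string Piece, string Delimiter)>();

        foreach (Match match in Regex.Matches(input, @"([^{|}]+)([{|}]?)"))
        {
            // Groups[0] is the whole match; capture groups start at index 1.
            result.Add((match.Groups[1].Value, match.Groups[2].Value));
        }

        return result;
    }
}
```

Calling `SpinTextSplitter.SplitKeepingDelimiters("This is a {long|ugly|example} string ")` gives five pairs: ("This is a ", "{"), ("long", "|"), ("ugly", "|"), ("example", "}") and (" string ", ""). One caveat: because the pattern requires at least one non-delimiter character before each delimiter, a delimiter at the very start of the input, or one that directly follows another delimiter, is skipped. If you only need a flat array with the delimiters kept, `Regex.Split(input, @"([{|}])")` is another option, since .NET includes text captured by a group in the array that `Regex.Split` returns.
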
Slider345