2

I need to split the following string:

value1,value2[2,5],value3[4,7],value4,value5[7,4]

After the split I should have the following array:

value1
value2[2,5]
value3[4,7]
value4
value5[7,4]

I can't simply split on the comma, and I don't want to complicate the logic too much. I would like the simplest logic that does this.

Thanks for the help.

edit: my attempt:

  var parts = Regex.Split(line, "/([^,]+\\[[^,\\]]*\\,[^,\\]]*\\])|([^,]+)|(,,)/g");
Raskolnikov
  • There are a lot of ways to answer your question. Please show what you attempted first. Splitting with `,` would never work here. – Wiktor Stribiżew Jul 22 '16 at 07:53
  • is it always `value` with a number at the end? or is this just an example and there could be any string? – Mong Zhu Jul 22 '16 at 07:55
  • no, does not have to be number. – Raskolnikov Jul 22 '16 at 07:56
  • but always `value`? – Mong Zhu Jul 22 '16 at 07:57
  • See http://stackoverflow.com/questions/14792931/java-regex-split-comma-separated-list-but-exclude-commas-within-parentheses, replace `(` with `[` and `\(` with `\[`. Oh, you added your effort, great. .NET regex does not support regex delimiters and `/g` modifier. Can the brackets be nested? Like `value1,value2[2,[56,78]]`? – Wiktor Stribiżew Jul 22 '16 at 07:58
  • Look here, your regex works - `(?<val>[^,]+\[[^,\]]*,[^,\]]*])|(?<val>[^,]+)|(,,)`, I only added the named captures. See [this demo](http://regexstorm.net/tester?p=(%3f%3cval%3e%5b%5e%2c%5d%2b%5c%5b%5b%5e%2c%5c%5d%5d*%2c%5b%5e%2c%5c%5d%5d*%5d)%7c(%3f%3cval%3e%5b%5e%2c%5d%2b)%7c(%2c%2c)&i=value1%2cvalue2%5b2%2c5%5d%2cvalue3%5b4%2c7%5d%2cvalue4%2cvalue5%5b7%2c4%5d). No idea why you use `(,,)` – Wiktor Stribiżew Jul 22 '16 at 08:02
  • I'd use: [`[^][,]+(?:\[[^][]*])?`](http://regexstorm.net/tester?p=%5b%5e%5d%5b%2c%5d%2b(%3f%3a%5c%5b%5b%5e%5d%5b%5d*%5d)%3f&i=value1%2cvalue2%5b2%2c5%5d%2cvalue3%5b4%2c7%5d%2cvalue4%2cvalue5%5b7%2c4%5d) to *match* these values. – Wiktor Stribiżew Jul 22 '16 at 08:16
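
The matching approach from the last comment can be sketched roughly as below. This is only an illustration, not the commenter's exact code: the class and method names (`BracketAwareMatcher`, `MatchItems`) are made up for the example, and the pattern is the commented one with the brackets escaped explicitly. It assumes the `[...]` groups are never nested.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class BracketAwareMatcher
{
    // An item is a run of characters other than ',', '[' and ']',
    // optionally followed by one non-nested [...] group.
    private static readonly Regex ItemPattern =
        new Regex(@"[^\]\[,]+(?:\[[^\]\[]*\])?");

    // Hypothetical helper: returns every item found in the line, in order.
    public static string[] MatchItems(string line)
    {
        return ItemPattern.Matches(line)
                          .Cast<Match>()
                          .Select(m => m.Value)
                          .ToArray();
    }
}
```

Calling `BracketAwareMatcher.MatchItems("value1,value2[2,5],value3[4,7],value4,value5[7,4]")` should give the five items from the question. Because this matches items instead of splitting, empty entries (as in `a,,b`) are silently skipped.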

3 Answers

5

You can split on `,(?!\d+])`:

string st = @"value1,value2[2,5],value3[4,7],value4,value5[7,4]";
var output = Regex.Split(st, @",(?!\d+])").ToList();

Which will output:

value1
value2[2,5]
value3[4,7]
value4
value5[7,4]
Thomas Ayoub
  • `.RemoveAll(s => s == ",");` is redundant, just remove the capturing group around `,`. And a note: this won't work if `[...]` can be nested. Also, technically, this still allows splitting with a comma that is not *inside* `[...]` as you only check for a *closing* `]`, not if the comma is preceded with a `[`. *Matching* approach would be most appropriate here (IMHO). – Wiktor Stribiżew Jul 22 '16 at 08:04
  • You can get expected result without removing ",". Replace "(,)(?!\d+])" by ",(?!\d+])" – Roman Jul 22 '16 at 08:04
  • @WiktorStribiżew thanks once again. I wasn't aware that the capturing group would keep the captured value – Thomas Ayoub Jul 22 '16 at 08:06
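
If the values inside the brackets are not always digits, one possible variation is to look ahead to the next bracket character instead of to `\d+]`. The sketch below is an assumption-laden illustration, not part of the answer above: the names `BracketAwareSplitter` and `SplitItems` are invented, and it assumes brackets are balanced and never nested.

```csharp
using System.Text.RegularExpressions;

public static class BracketAwareSplitter
{
    // A comma counts as a separator unless the first bracket character
    // that follows it is a closing ']' (which means the comma sits
    // inside a [...] group). No capturing group is used, so the commas
    // themselves are not returned by Split.
    private static readonly Regex Separator =
        new Regex(@",(?![^\[\]]*\])");

    // Hypothetical helper: splits the line on top-level commas only.
    public static string[] SplitItems(string line)
    {
        return Separator.Split(line);
    }
}
```

With the question's input, `BracketAwareSplitter.SplitItems(...)` should return the same five items. It still only inspects what follows the comma, so the caveat about nested brackets from the comments applies here as well.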
1

Try this:

string input = "value1,value2[2,5],value3[4,7],value4,value5[7,4]";
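// The bracketed alternative has to come first; otherwise \w+\d+ matches "value2" on its own and the "[2,5]" part is lost.
string pattern = @"(?'value'\w+\d+\[\d+,\d+\]),?|(?'value'\w+\d+),?";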
MatchCollection matches = Regex.Matches(input, pattern);
foreach (Match match in matches)
{
    Console.WriteLine(match.Groups["value"].Value);
}
fedorqui
jdweng
1

It seems to me that splitting is the wrong approach here: it would be easier to understand and maintain if you matched the items you're looking for rather than splitting on commas. For example:

IEnumerable<string> values =
    Regex.Matches(input, @"\w+\d+(\[\d+,\d+\])?").Cast<Match>().Select(m => m.Value);
spender
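
The comments ask whether brackets can be nested (as in `value1,value2[2,[56,78]]`), which none of the patterns on this page handle. If that case matters, a plain loop that tracks bracket depth is one option. This is a minimal sketch under stated assumptions, not something taken from the answers: `NestedBracketSplitter` and `SplitTopLevel` are hypothetical names, and unbalanced closing brackets are simply ignored.

```csharp
using System.Collections.Generic;
using System.Text;

public static class NestedBracketSplitter
{
    // Hypothetical helper: splits on commas that are outside all [...] groups.
    public static List<string> SplitTopLevel(string line)
    {
        var items = new List<string>();
        var current = new StringBuilder();
        int depth = 0; // number of '[' currently open

        foreach (char c in line)
        {
            if (c == '[') depth++;
            else if (c == ']' && depth > 0) depth--;

            if (c == ',' && depth == 0)
            {
                // Top-level comma: finish the current item.
                items.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        // The last item has no trailing comma.
        items.Add(current.ToString());
        return items;
    }
}
```

For `value1,value2[2,[56,78]],value3` this should yield `value1`, `value2[2,[56,78]]` and `value3`. Like `String.Split`, it returns a single empty string for empty input.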