What is a C# option for splitting this string:
"['A','B', ''],['A','D', 'F'],['A','G', 'G']"
into a list of strings:
"['A','B', '']"
"['A','D', 'F']"
"['A','G', 'G']"
You're better off writing a simple parser than trying to match balanced text with a regular expression:
var str = "['A','B', ''],['A','D', 'F'],['A','G', 'G']";
var topLevelLists = new List<string>();
var arrStart = -1;
var nesting = 0;
for (int i = 0; i != str.Length; ++i) {
    if (str[i] == '[') {
        if (nesting == 0) {
            arrStart = i;
        }
        ++nesting;
    }
    else if (str[i] == ']') {
        if (nesting <= 0) {
            // Error, ']' without matching '[' at i
            break;
        }
        --nesting;
        if (nesting == 0) {
            topLevelLists.Add(str.Substring(arrStart, i - arrStart + 1));
        }
    }
}
if (nesting > 0) {
    // Error, unmatched '[' at arrStart
}
// topLevelLists => [ "['A','B', '']", "['A','D', 'F']", "['A','G', 'G']" ];
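If you want to reuse the loop above, it could be wrapped in a method roughly like this. This is only a sketch: the names BracketSplitter and SplitTopLevel are made up for illustration, and throwing FormatException on unbalanced input is my assumption about how the error cases should be handled. Like the loop above, it also counts brackets that appear inside quoted values.

```csharp
using System;
using System.Collections.Generic;

public static class BracketSplitter
{
    // Hypothetical helper: returns every top-level [...] group found in the input.
    public static List<string> SplitTopLevel(string input)
    {
        var result = new List<string>();
        int start = -1;   // index of the '[' that opened the current top-level group
        int nesting = 0;  // current bracket depth

        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c == '[')
            {
                if (nesting == 0) start = i;
                nesting++;
            }
            else if (c == ']')
            {
                if (nesting == 0)
                    throw new FormatException($"']' without matching '[' at index {i}");
                nesting--;
                if (nesting == 0)
                    result.Add(input.Substring(start, i - start + 1));
            }
        }

        if (nesting > 0)
            throw new FormatException($"Unmatched '[' at index {start}");
        return result;
    }
}
```

Because it tracks depth, nested input is kept together: BracketSplitter.SplitTopLevel("[1,[2,3]],[4]") gives "[1,[2,3]]" and "[4]".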
You can use this regex, (?<=\]), to split on every comma that is preceded by ].
The code is:
String input = "['A','B', ''],['A','D', 'F'],['A','G', 'G']";
String pattern = @"(?<=\]),";
var split = Regex.Split(input, pattern);
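For completeness, here is the same split as a self-contained method with the using directive it needs. The class and method names (LookbehindSplit, Split) are just placeholders. One caveat I'd assume applies: the pattern knows nothing about nesting, so a comma after an inner ] is treated as a separator too.

```csharp
using System.Text.RegularExpressions;

public static class LookbehindSplit
{
    public static string[] Split(string input)
    {
        // (?<=\]) is a zero-width lookbehind: it requires a ']' immediately
        // before the comma but does not consume it, so every piece keeps
        // its closing bracket and only the comma itself is removed.
        return Regex.Split(input, @"(?<=\]),");
    }
}
```

Commas inside a group, such as the one in 'B', '', are left alone because they follow a quote rather than a ].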
If your separator is ], (a comma directly after the closing bracket ]), you can use a trick:
var parts = string.Join("]" + char.MaxValue, input
        .Split(new[] {"],"}, StringSplitOptions.None))
    .Split(char.MaxValue);
This approach simply replaces every comma that comes directly after a square bracket with a temporary char (char.MaxValue in this case) and then splits the string on that character. It assumes the temporary char never occurs in the input itself.
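The same idea can probably be written more directly with string.Replace, since splitting on "]," and joining with "]" plus a sentinel is just a replacement. A minimal sketch, with SentinelSplit as a made-up name and the same assumption that char.MaxValue never appears in the input:

```csharp
using System;

public static class SentinelSplit
{
    public static string[] Split(string input)
    {
        // Assumption: char.MaxValue ('\uffff') never occurs in the input,
        // so it is safe to use as a temporary separator.
        char sentinel = char.MaxValue;

        // Turn every "]," into "]" + sentinel, then split on the sentinel.
        return input.Replace("],", "]" + sentinel).Split(sentinel);
    }
}
```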
The "pure Regex" answer is like this:
string str = "['A','B', ''],['A','D', 'F'],['A','G', 'G']";
string[] strs =
    Regex.Matches(str, @"(\[.*?\])")
        .OfType<Match>()
        .Select(m => m.Groups[0].Value)
        .ToArray();
which is more tolerant of different - or even mixed - separators between the bracketed groups, such as comma-space or space rather than just a comma. If your input string is well-defined then this won't be an issue, but I prefer to be able to handle inputs which might come from different sources and might not quite conform.
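To illustrate the tolerance claim, here is a small sketch that runs the same pattern over input with mixed separators. BracketMatcher and Groups are names I made up. Note that the lazy .*? stops at the first ], so this assumes the groups are not nested and contain no ] inside their values.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class BracketMatcher
{
    // Hypothetical helper: collects every [...] group, ignoring whatever
    // text (comma, comma-space, plain space) sits between the groups.
    public static string[] Groups(string input)
    {
        return Regex.Matches(input, @"\[.*?\]")
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }
}
```

For example, BracketMatcher.Groups("['A','B', ''], ['A','D', 'F'] ['A','G', 'G']") still yields the three bracketed groups, even though one separator is comma-space and the other is a single space.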