I'm working with a rather large set of strings I have to process as quickly as possible.
The format is quite fixed:
[name]/[type]:[list ([key] = [value],)]
or
[name]/[type]:[key]
I hope my representation is okay. What this means is that I have a word (in my case I call it Name), then a slash, followed by another word (I call it Type), then a colon, and it is followed by either a comma-separated list of key-value pairs (key = value) or a single key.
Name and Type cannot contain any whitespace; the key and value fields, however, can.
Currently I'm using Regex to parse this data, and a split:
var regex = @"(\w+)\/(\w+):(.*)";
var r = new Regex(regex, RegexOptions.IgnoreCase | RegexOptions.Singleline);
var m = r.Match(Id);
if (m.Success) {
    Name = m.Groups[1].Value;
    Type = m.Groups[2].Value;
    foreach (var intern in m.Groups[3].Value.Split(','))
    {
        var split = intern.Trim().Split('=');
        if (split.Length == 2)
            Items.Add(split[0], split[1]);
        else if (split.Length == 1)
            Items.Add(split[0], split[0]);
    }
}
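One thing that might be worth trying before changing the approach: the snippet above builds a new Regex on every call. A minimal sketch of reusing a single instance follows, assuming the pattern never changes. The class name IdPattern is hypothetical, and IgnoreCase is dropped on the assumption that it has no effect on a pattern without literal letters. Whether RegexOptions.Compiled actually helps for this workload would need to be measured.

```csharp
using System.Text.RegularExpressions;

// Hypothetical holder class; not part of the original code.
public static class IdPattern
{
    // Constructed once and shared across calls.
    // RegexOptions.Compiled trades a higher construction cost for faster matching.
    public static readonly Regex Instance =
        new Regex(@"(\w+)/(\w+):(.*)", RegexOptions.Compiled | RegexOptions.Singleline);
}
```

The existing code would then call IdPattern.Instance.Match(Id) instead of creating r each time.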
Now I know this is not the most optimal approach, but I'm not sure which would be the fastest:
- Split the string first by :, then by / for the first element and by , for the second, then process the latter list and split each entry again by =
- Use the current mixture as it is
- Use a completely regex-based approach
Of course I'm open to suggestions; my main goal is to achieve the fastest processing of this single string.
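For comparison, the first option (splitting by hand) could be sketched roughly as below, using IndexOf and Substring instead of Regex and Split. The ParsedId and IdParser names are hypothetical. The sketch also makes a few assumptions that differ from the code above: it does not check that Name and Type consist only of word characters, it trims whitespace around = and skips empty entries, it splits each entry at the first = only, and it overwrites duplicate keys instead of throwing. I have not benchmarked it, so whether it is actually faster is an open question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container; the members mirror Name, Type and Items from the question.
public sealed class ParsedId
{
    public string Name = "";
    public string Type = "";
    public readonly Dictionary<string, string> Items = new Dictionary<string, string>();
}

public static class IdParser
{
    // Returns false when the input does not have the name/type:rest shape.
    public static bool TryParse(string id, out ParsedId result)
    {
        result = new ParsedId();

        int slash = id.IndexOf('/');
        if (slash <= 0) return false;                       // missing or empty name
        int colon = id.IndexOf(':', slash + 1);
        if (colon < 0 || colon == slash + 1) return false;  // missing or empty type

        result.Name = id.Substring(0, slash);
        result.Type = id.Substring(slash + 1, colon - slash - 1);

        // Walk the comma-separated tail without allocating an intermediate array.
        int pos = colon + 1;
        while (pos <= id.Length)
        {
            int comma = id.IndexOf(',', pos);
            int end = comma < 0 ? id.Length : comma;
            string item = id.Substring(pos, end - pos).Trim();
            if (item.Length > 0)
            {
                int eq = item.IndexOf('=');
                if (eq < 0)
                    result.Items[item] = item;              // single key: key doubles as value
                else
                    result.Items[item.Substring(0, eq).TrimEnd()] =
                        item.Substring(eq + 1).TrimStart();
            }
            pos = end + 1;
        }
        return true;
    }
}
```

Timing this against the regex version on a representative sample of the real strings (for example with Stopwatch) seems like the only reliable way to decide between the three options.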