2

If I have a series of strings that have this base format:

"[id value]"//id and value are space delimited.  id will never have spaces

They can then be nested like this:

[a]
[a [b value]]
[a [b [c value]]]

So every item can have 0 or 1 value entries.

What is the best approach to go about parsing this format? Do I just use stuff like string.Split() or string.IndexOf() or are there better methods?

S.S. Anne
Shawn

4 Answers

2

A little recursion plus Split would work. The main point is to use recursion; it makes this much easier. Your input syntax looks a bit like Lisp :)

Parsing a: split, there is no second part, done.
Parsing a [b value]: split, there is a second part, so go back to the beginning with that part.
...

You get the idea.
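
The steps above could be sketched roughly like this. It is only a minimal sketch: the `Node` and `NestedParser` names are made up for illustration, and it assumes the input is well formed (balanced brackets, and a value is either plain text or one nested item).

```csharp
using System;

// Hypothetical node type: an id plus either a plain value, a nested item, or nothing.
public sealed class Node
{
    public string Id;
    public string Value;   // plain text value, or null
    public Node Child;     // nested item, or null
}

public static class NestedParser
{
    // Parses one item of the form "[id]", "[id value]" or "[id [nested]]".
    public static Node Parse(string text)
    {
        text = text.Trim();
        if (text.Length < 2 || text[0] != '[' || text[text.Length - 1] != ']')
            throw new FormatException("Expected an item wrapped in [ ]: " + text);

        // Drop the outer brackets, then split on the first space only.
        string inner = text.Substring(1, text.Length - 2).Trim();
        string[] parts = inner.Split(new[] { ' ' }, 2);

        var node = new Node { Id = parts[0] };
        if (parts.Length == 2)
        {
            string rest = parts[1].Trim();
            if (rest.Length > 0 && rest[0] == '[')
                node.Child = Parse(rest);   // second part is an item: start over on it
            else
                node.Value = rest;          // second part is a plain value
        }
        return node;
    }
}
```

Calling `NestedParser.Parse("[a [b value]]")` should give a node with id `a` whose child has id `b` and value `value`. Because the split is limited to the first space, a plain value may itself contain spaces.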

dutt
2

There is nothing wrong with the Split and IndexOf methods; they exist for string parsing. Here is a sample for your case:

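        string str = "[a [b [c [d value]]]]";

        // Work from the inside out: the last '[' and the first ']' always
        // enclose the innermost item that is still left in the string.
        while (str.Trim().Length > 0)
        {
            int start = str.LastIndexOf('[');
            int end = str.IndexOf(']');

            // Text between the brackets, e.g. "d value" on the first pass, then "c", "b", "a".
            string s = str.Substring(start + 1, end - (start + 1)).Trim();
            // This is what you are looking for: its length will be 2 if the item has a value.
            string[] pair = s.Split(' ');

            // Remove the item that was just handled, brackets included.
            str = str.Remove(start, (end + 1) - start);
        }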
Ali YILDIRIM
  • `Split` and `IndexOf` exist for (advanced) string parsing insofar as shotguns exist for shooting yourself in the foot. ;-) But I actually like your code and it should work as long as the value doesn’t contain spaces (although it is **very** inefficient). – Konrad Rudolph Oct 12 '10 at 09:01
1

Regex is always a nice solution.

string test = "[a [b [c [value]]]";
Regex r = new Regex("\\[(?<id>[A-Za-z]*) (?<value>.*)\\]");
var res = r.Match(test);

Then you can get the value (which is [b [c [value]] after the first iteration) and apply the same again until the match fails.

string id = res.Groups["id"].Value;       // same as res.Groups[1]
string value = res.Groups["value"].Value; // same as res.Groups[2]
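
The "apply the same again until the match fails" step could look roughly like the loop below. This is a sketch only: `RegexWalker` and `Walk` are made-up names, the `^` and `$` anchors are an addition to the pattern, and the test input is assumed to have balanced brackets.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexWalker
{
    // Same pattern as in the snippet above; the ^ and $ anchors are an added assumption.
    private static readonly Regex Item =
        new Regex(@"^\[(?<id>[A-Za-z]*) (?<value>.*)\]$");

    // Returns the ids from outermost to innermost. 'remainder' receives
    // whatever text is left once the pattern no longer matches.
    public static List<string> Walk(string input, out string remainder)
    {
        var ids = new List<string>();
        string current = input;
        Match m = Item.Match(current);
        while (m.Success)
        {
            ids.Add(m.Groups["id"].Value);
            current = m.Groups["value"].Value;   // feed the value back in
            m = Item.Match(current);
        }
        remainder = current;
        return ids;
    }
}
```

Note that the pattern requires a space after the id, so an item without a value such as `[a]` does not match at all and is returned unchanged in `remainder`.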
testalino
  • Regex is not always a nice solution. "Oh, I can solve this problem with regex" - now you have two problems. – Restuta Oct 12 '10 at 07:56
  • Well, what is your problem (or even two) with the solution? I think it is clearer than any split operation. – testalino Oct 12 '10 at 08:04
  • You think so, but the other developers who will maintain this may not. Split is not good either. – Restuta Oct 12 '10 at 08:16
  • If it's nested in more ways than straight top down you will get in trouble. http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Jonas Elfström Oct 12 '10 at 09:27
0

A simple split should work. For every id there is one opening bracket [.
So when you split the string on that bracket, n brackets give you n-1 id(s), and the last element contains the value.
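
One way to read this suggestion, as a rough sketch: split on `[`, strip the closing brackets from each piece, and treat anything after the first space in a piece as the value. The `SplitParser` name is made up, the input is assumed to look like `[a [b [c value]]]`, and nothing here checks that the brackets are balanced.

```csharp
using System;
using System.Collections.Generic;

public static class SplitParser
{
    // Returns the ids in order; 'value' receives the value if one is present, else null.
    public static List<string> Ids(string input, out string value)
    {
        var ids = new List<string>();
        value = null;

        // Every '[' starts a new piece; the empty text before the first '[' is dropped.
        string[] pieces = input.Split(new[] { '[' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string piece in pieces)
        {
            // Strip the closing brackets and surrounding spaces, e.g. "c value]]]" -> "c value".
            string item = piece.Trim(' ', ']');
            if (item.Length == 0) continue;

            // Split on the first space only: id first, optional value second.
            string[] pair = item.Split(new[] { ' ' }, 2);
            ids.Add(pair[0]);
            if (pair.Length == 2) value = pair[1].Trim();
        }
        return ids;
    }
}
```

This is short, but it accepts malformed input silently, so it is probably only suitable when the strings are known to be well formed.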

Myra