
This is the string:

String str = "(S(B1)(B2(B21)(B22)(B23))(B3)())";

The content of a child () may be empty (""), a plain value, or another string following the same pattern, recursively, so each nested () is a subtree.

Expected result:

str1 is "(S(B1))"
str2 is "(B2(B21)(B22)(B23))" // do not expand the children of a child
str3 is "(B3)"
str4 is "()"

str1 to str4 could be, for example, the elements of an array.

How to split the string?

I asked a similar question before: How to split this string in Java regex? But its answer is not good enough for this case.

droidpiggy

1 Answer

Regexes do not have sufficient power to parse balanced/nested brackets. This is essentially the same problem as parsing markup languages such as HTML where the consistent advice is to use special parsers, not regexes.

You should parse this as a tree. In overall terms:

  • Create a stack.
  • When you hit a "(", push the next chunk onto the stack.
  • When you hit a ")", pop the stack.

This takes a few minutes to write and will check that your input is well-formed.
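The stack steps above can be sketched roughly as follows. This is only one possible reading of the outline: the class name BracketTree, the Node type and the choice to keep the text directly inside a group as its label are all assumptions made for illustration, not part of the original answer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class BracketTree {
    /** One node per "(...)" group; label is the text directly inside that group. */
    static final class Node {
        final StringBuilder label = new StringBuilder();
        final List<Node> children = new ArrayList<>();
    }

    /** Parses a string made of exactly one bracketed group; throws on malformed input. */
    static Node parse(String s) {
        Deque<Node> stack = new ArrayDeque<>();
        Node root = null; // last node popped; the outermost group once parsing ends
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                if (stack.isEmpty() && root != null) {
                    throw new IllegalArgumentException("Second top-level group at index " + i);
                }
                Node node = new Node();
                if (!stack.isEmpty()) {
                    stack.peek().children.add(node); // attach to the enclosing group
                }
                stack.push(node);
            } else if (c == ')') {
                if (stack.isEmpty()) {
                    throw new IllegalArgumentException("Unmatched ')' at index " + i);
                }
                root = stack.pop();
            } else {
                if (stack.isEmpty()) {
                    throw new IllegalArgumentException("Text outside brackets at index " + i);
                }
                stack.peek().label.append(c); // plain text belongs to the innermost open group
            }
        }
        if (!stack.isEmpty()) {
            throw new IllegalArgumentException("Unclosed '(' (" + stack.size() + " still open)");
        }
        if (root == null) {
            throw new IllegalArgumentException("No bracketed group found");
        }
        return root;
    }
}
```

For the string in the question, BracketTree.parse(str) should return a root labelled S with four children (B1, B2, B3 and an empty one), where B2 has three children of its own.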

This will save you time almost immediately. Trying to manage regexes for this will become more and more complex and will almost inevitably break down.

UPDATE: If you are only concerned with one level then it can be simpler (NOT debugged):

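import java.util.ArrayList;
import java.util.List;

List<String> subTreeList = new ArrayList<String>();
String s = getMyString();
int level = 0;             // current nesting depth
int lastOpenBracket = -1;  // index of the "(" that opened the current level-1 group
for (int i = 0; i < s.length(); i++) {
    char c = s.charAt(i);
    if (c == '(') {
        level++;
        if (level == 1) {
            lastOpenBracket = i;
        }
    } else if (c == ')') {
        // Level 1 is the outermost bracket. If the whole string is wrapped in
        // one outer pair, compare against 2 (here and above) to get its children.
        if (level == 1) {
            // substring's end index is exclusive, so i + 1 keeps the ")"
            subTreeList.add(s.substring(lastOpenBracket, i + 1));
        }
        level--;
    }
}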

I haven't checked that it works, and you should debug it. You should also add checks to make sure you don't have hanging brackets at the end or strange characters at level == 1.
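As a rough illustration of the checks mentioned above, here is a minimal sketch of the same single-pass idea with a configurable depth and basic validation. The class name SubTreeSplitter, the method split and the targetDepth parameter are hypothetical names introduced for this example. It rejects characters outside all brackets (level 0) rather than at level 1, because the string in the question has the label S at level 1.

```java
import java.util.ArrayList;
import java.util.List;

public class SubTreeSplitter {
    /**
     * Returns every bracketed group that opens at targetDepth (1 = outermost),
     * without expanding anything nested inside it. Throws on unbalanced input.
     */
    static List<String> split(String s, int targetDepth) {
        List<String> result = new ArrayList<>();
        int level = 0;   // current nesting depth
        int start = -1;  // index of the "(" that opened the current target-depth group
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                level++;
                if (level == targetDepth) {
                    start = i;
                }
            } else if (c == ')') {
                if (level == 0) {
                    throw new IllegalArgumentException("Unmatched ')' at index " + i);
                }
                if (level == targetDepth) {
                    result.add(s.substring(start, i + 1)); // end index is exclusive
                }
                level--;
            } else if (level == 0) {
                throw new IllegalArgumentException("Character outside brackets at index " + i);
            }
        }
        if (level != 0) {
            throw new IllegalArgumentException("Unclosed '(' (" + level + " still open)");
        }
        return result;
    }
}
```

With the string from the question, SubTreeSplitter.split(str, 2) should give "(B1)", "(B2(B21)(B22)(B23))", "(B3)" and "()". Note that the first element is "(B1)" rather than the "(S(B1))" shown in the question; if the root label S has to be attached to the first child, that would need an extra step.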

peter.murray.rust