1

I have a string of this form:

string 1
{
    string 2
    string 3
    {
        string 4
        string 5
    }
    string 6
    {
        string 7
        string 8
    }
    string 9
    {
        string 10
        string 11
        string 12
        {
            string 13
            string 14
        }
        string 15
    }
}
string 16
string 17

So basically I have a Java-class-like structure,
and now I want a piece of code that can extract the following substrings (SS#):
SS1:

        string 4
        string 5

SS2:

        string 7
        string 8

SS3:

            string 13
            string 14

SS4:

string 16
string 17

SS5:

        string 10
        string 11
        string 12
        {
            string 13
            string 14
        }
        string 15

SS6:

    string 2
    string 3
    {
        string 4
        string 5
    }
    string 6
    {
        string 7
        string 8
    }
    string 9
    {
        string 10
        string 11
        string 12
        {
            string 13
            string 14
        }
        string 15
    }

So basically I want a piece of code that splits the parts of a string (a Java class), such as functions and classes but not loops, into separate substrings.
I read this:
Regex to get string between curly braces "{I want what's between the curly braces}"
but it only gets the data between one pair of '{' and '}', without accounting for any '{' that comes after the first one.
I don't need the full code, just some direction on how to proceed.

Gaurav Bansal
  • Because of the arbitrarily nested braces, you'll need a recursive regex solution, and Java doesn't support them. So either use a recursive descent parser or a different language (.NET, Perl,...) that does support them. – Tim Pietzcker Sep 13 '12 at 05:52

4 Answers

2

Regex is not a perfect fit for this; it's always better to use a stack.

But if a regex solution is required, then this might work (not always):

(?is)\{[^}]*?\}(?=.*?\})

Explanation

    Match the remainder of the regex with the options: case insensitive (i); dot matches newline (s) «(?is)»
    Match the character “{” literally «\{»
    Match any character that is NOT a “}” «[^}]*?»
       Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
    Match the character “}” literally «\}»
    Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=.*?\})»
       Match any single character «.»
          Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
       Match the character “}” literally «\}»
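
To see what the pattern actually returns, it can be run through java.util.regex. This is only a minimal sketch: the class name RegexBraceDemo, the method findMatches and the sample string are made up for illustration. On nested input, a match starts at an outer `{` and stops at the first `}`, because `[^}]` still lets `{` through. That is one case of the "not always" caveat.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
final class RegexBraceDemo {

    // The pattern from the answer: (?is)\{[^}]*?\}(?=.*?\})
    private static final Pattern BLOCK =
            Pattern.compile("(?is)\\{[^}]*?\\}(?=.*?\\})");

    /** Collects every match of the pattern in the input, in order. */
    static List<String> findMatches(String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = BLOCK.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Assumed sample input with one level of nesting.
        for (String match : findMatches("a { b { c } d } e")) {
            System.out.println(match);
        }
    }
}
```

For the sample in main, the single match is `{ b { c }`, which is neither the inner block nor the outer one.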
Cylian
  • @Tim: That is just to ensure that it matches only the nested substrings, not the outermost one, as the OP mentioned. – Cylian Sep 13 '12 at 06:03
0

I don't know a regex for this, but I can suggest another solution. Here it is as pseudocode:

  1. Scan the next character of the given input string.
  2. If the character is {, push its position onto the stack.
  3. Else, if the character is }, pop a position from the stack and take substring(popped_position, current_position) as SS#.
  4. Go to step 1 and repeat until no characters are left in the string.
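
The steps above could be sketched in Java roughly like this. The class name BraceBlocks and the method extractBlocks are made up for illustration. The sketch assumes that braces inside string literals or comments do not occur, and it silently skips a `}` that has no matching `{`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper class illustrating the stack-based scan.
final class BraceBlocks {

    /**
     * Returns the text inside every matched {...} pair,
     * in the order in which the closing braces appear.
     */
    static List<String> extractBlocks(String input) {
        Deque<Integer> openPositions = new ArrayDeque<>();
        List<String> blocks = new ArrayList<>();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '{') {
                // Remember where this block starts.
                openPositions.push(i);
            } else if (c == '}' && !openPositions.isEmpty()) {
                // Closing brace: pair it with the most recent open brace.
                int start = openPositions.pop();
                blocks.add(input.substring(start + 1, i));
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Assumed sample input; prints [c], [e], then [b { c } d { e } f].
        for (String block : extractBlocks("a { b { c } d { e } f } g")) {
            System.out.println("[" + block.trim() + "]");
        }
    }
}
```

Inner blocks come out before the blocks that enclose them, which matches the SS1 to SS6 ordering in the question except for SS4. Text outside all braces (SS4) is not collected by this sketch and would need separate handling.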
PC.
0

It will be very difficult to do this with regular expressions. I'd suggest that you split this structure by newline and apply a few simple rules to build a data structure of HashMaps. String 2 will be a key, but it will have no value. String 3 will be the next key, and its value will be the content of the curly braces below it, starting at the line that begins with { and ending at the line that begins with }.
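
One way this line-based idea might look in Java is sketched below. The class name BlockMap and the method parse are hypothetical, and the sketch assumes balanced input, that every brace sits on its own line (as in the question), and that keys on one level are unique. It uses a LinkedHashMap rather than a plain HashMap only to keep the keys in input order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class illustrating the line-by-line approach.
final class BlockMap {

    /**
     * Maps each top-level line to the raw text of the brace block
     * that follows it, or to null if no block follows.
     */
    static Map<String, String> parse(String input) {
        Map<String, String> result = new LinkedHashMap<>();
        StringBuilder body = new StringBuilder();
        String lastKey = null;
        int depth = 0;
        for (String raw : input.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.equals("{")) {
                // Nested opening braces belong to the current block body.
                if (depth > 0) body.append(raw).append('\n');
                depth++;
            } else if (line.equals("}")) {
                depth--;
                if (depth == 0) {
                    // The top-level block is complete: attach it to its key.
                    if (lastKey != null) result.put(lastKey, body.toString());
                    body.setLength(0);
                } else {
                    body.append(raw).append('\n');
                }
            } else if (depth == 0) {
                // A top-level line becomes a key with no value so far.
                lastKey = line;
                result.put(lastKey, null);
            } else {
                body.append(raw).append('\n');
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample input, shaped like the question's string.
        String sample = "string 1\n{\n    string 2\n    string 3\n    {\n        string 4\n    }\n}\nstring 16";
        parse(sample).forEach((key, value) -> System.out.println(key + " -> " + value));
    }
}
```

Each value is itself a string of the same shape, so calling parse on a value again would give the map for the next nesting level.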

Desislav Kamenov
-1

Try this regex to match {string}:

\{(.*)\}
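
A minimal sketch of using this pattern from Java follows. The class name GreedyBraceMatch and the method outermost are made up for illustration. By default `.` does not match line terminators in java.util.regex, so the sketch assumes Pattern.DOTALL is wanted for multi-line input like the one in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for the greedy pattern \{(.*)\}
final class GreedyBraceMatch {

    // DOTALL lets '.' match line terminators, so a match can span several lines.
    private static final Pattern OUTER =
            Pattern.compile("\\{(.*)\\}", Pattern.DOTALL);

    /** Returns the text between the first '{' and the last '}', or null if there is none. */
    static String outermost(String input) {
        Matcher m = OUTER.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Assumed sample input with one nested block.
        System.out.println(outermost("a {\n b { c }\n d\n} e"));
    }
}
```

Because `.*` is greedy, group 1 runs from the first `{` to the last `}` of the input. For the question's string that corresponds to SS6 only, and for two sibling blocks such as `{a} x {b}` it returns `a} x {b`.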
kushalbhaktajoshi