
My program is trying to match strings surrounded by curly braces recursively, i.e. when given `{example1 {example2}} {example3} {example4}` it should match `example1 {example2}`, `example3` and `example4`. I tried using `\{.+\}`, but because `.+` is greedy that matches `example1 {example2}} {example3} {example4`. The lazy version `\{.+?\}` matches `example1 {example2`, `example3` and `example4`. Is there an easy way to do what I want with a regex, or alternatively simply in Java? I am not very well-versed in regular expressions, as you can probably tell.

condorcraft110 II
  • It is certainly possible, but it will be an absolute pain to code, since regexes are not meant for this. Just traverse the string and keep track of the number of `{` and `}`. – Keppil Dec 08 '14 at 17:55
  • @AvinashRaj this isn't a duplicate, as I'm trying to extract the strings rather than ensure that brackets are balanced. – condorcraft110 II Feb 09 '15 at 20:12
  • @Keppil, indeed, that's what I ended up doing - it was surprisingly little code – condorcraft110 II Feb 09 '15 at 20:13

1 Answer


Regexes aren't good at parsing things that require state, such as nested delimiters. You could probably come up with something that works in some limited scenarios, but not a good regex that handles the general case.

You are better off just parsing your string explicitly. Basically, scan the characters one-by-one, and:

  • if the char is `{`, increase the nesting counter. If the nesting counter is now 1, start a new group and skip to the next char;

  • if the char is `}`, decrease the nesting counter. If the nesting counter is now 0, end the current group and skip to the next char;

  • if the nesting counter is greater than 0, append the current char to the group and continue.
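
The steps above can be sketched in Java roughly as follows. This is only an illustration, not necessarily what the asker ended up writing: the class name `BraceGroups` and the method `extract` are made up for this example, and it assumes that a stray `}` outside any group is ignored and that a group left unclosed at the end of the input is dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names here are illustrative only.
final class BraceGroups {

    // Returns the contents of every top-level {...} group, keeping nested braces intact.
    static List<String> extract(String input) {
        List<String> groups = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // the "nesting counter" from the steps above

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '{') {
                depth++;
                if (depth == 1) {
                    current.setLength(0); // start a new group
                    continue;             // the outermost brace is not part of the group
                }
            } else if (c == '}') {
                if (depth == 0) {
                    continue; // assumption: ignore a stray closing brace
                }
                depth--;
                if (depth == 0) {
                    groups.add(current.toString()); // end the current group
                    continue;
                }
            }
            if (depth > 0) {
                current.append(c); // inside a group, nested braces included
            }
        }
        // Assumption: a group that is never closed is silently dropped.
        return groups;
    }
}
```

Calling `BraceGroups.extract("{example1 {example2}} {example3} {example4}")` returns a list containing `example1 {example2}`, `example3` and `example4`. If unbalanced input should be an error instead, the two places marked as assumptions are where an exception could be thrown.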

DarkDust
Dima