
I am trying to parse strings that have the following patterns:

  • a2[u]
  • 3[rst]5[g]
  • 3[r2[g]]

I want to extract the following tokens from these strings:

  • 2 [u]
  • 3 [rst], 5 [g]
  • 2 [g], 3 [r2[g]] (nested groups)

I am using the following pattern and code:

Pattern MY_PATTERN = Pattern.compile("(\\d+)\\[(.+)\\]");
String input = "3[rst]5[g]";
Matcher m = MY_PATTERN.matcher(input);
while(m.find()) {
    System.out.println(m.group(1) + " " + m.group(2));
}

However, .+ matches up to the last occurrence of ] instead of the first, which gives unexpected results. If I change the pattern to (\\d+)\\[(\\w+)\\], it works for this input but fails for 3[r2[g]]. What changes do I need to make so that it doesn't treat the whole string as one match?

Darshan Mehta

1 Answer

It looks like you need to make the quantifier on .+ reluctant.

As it stands, .+ is greedy: it consumes the rest of the string and then backtracks only as far as the last ]. Add the reluctant modifier ? after the +, so that the regex becomes (\\d+)\\[(.+?)\\], and see how far you get...
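
Here is a rough sketch of how the reluctant pattern behaves on the inputs from the question. The class name GroupTokens and the helpers reluctantTokens and nestedTokens are made up for illustration and do not come from the question. A reluctant .+? stops at the first ], so it works for 3[rst]5[g] but cuts the nested input 3[r2[g]] short. The stack-based nestedTokens is one possible workaround for the nested case, offered as an assumption about what the desired output is rather than as the only way to do it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupTokens {
    // Pattern from the question with a reluctant quantifier: .+? stops at the first ']'.
    private static final Pattern RELUCTANT = Pattern.compile("(\\d+)\\[(.+?)\\]");

    // Collects "count [body]" tokens using the reluctant regex.
    static List<String> reluctantTokens(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = RELUCTANT.matcher(input);
        while (m.find()) {
            tokens.add(m.group(1) + " [" + m.group(2) + "]");
        }
        return tokens;
    }

    // Hypothetical helper: pairs each '[' with its matching ']' using a stack,
    // so nested groups are reported innermost first.
    static List<String> nestedTokens(String input) {
        List<String> tokens = new ArrayList<>();
        // Each entry holds {index of the first digit of the count, index of '['}.
        Deque<int[]> open = new ArrayDeque<>();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '[') {
                int start = i;
                // Walk left over the digits that form the repeat count.
                while (start > 0 && Character.isDigit(input.charAt(start - 1))) {
                    start--;
                }
                open.push(new int[] {start, i});
            } else if (c == ']' && !open.isEmpty()) {
                int[] g = open.pop();
                if (g[0] < g[1]) { // skip brackets that have no count in front
                    tokens.add(input.substring(g[0], g[1]) + " " + input.substring(g[1], i + 1));
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(reluctantTokens("3[rst]5[g]")); // prints [3 [rst], 5 [g]]
        System.out.println(reluctantTokens("3[r2[g]]"));   // prints [3 [r2[g]] (body cut at first ']')
        System.out.println(nestedTokens("3[r2[g]]"));      // prints [2 [g], 3 [r2[g]]]
    }
}
```

The regex alone is fine as long as the groups are flat. For nested groups, java.util.regex has no recursive or balanced-group construct, which is why the sketch falls back to explicit bracket matching.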

Michael Wiles