3

I have the following Java code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s2 = "SUM   12  32 42";
Pattern pat1 = Pattern.compile("(PROD)|(SUM)(\\s+(\\d+))+");
Matcher m = pat1.matcher(s2);
System.out.println(m.matches());
System.out.println(m.groupCount());
for (int i = 1; i <= m.groupCount(); ++i) {
    System.out.println(m.group(i));
}

which produces:

true
4
null
SUM
 42
42

I wonder where the null comes from and why 12 and 32 are missing (I expected to find them among the groups).

Artem Pelenitsyn

3 Answers

5

A repeated group will contain the match of the last substring matching the expression for the group.

It would be nice if the regex engine gave back all substrings that matched a group. Unfortunately, java.util.regex does not support this.

Furthermore, groups are static and numbered by their opening parenthesis, like this:

                    0
          _______________________
         /                       \
         (PROD)|(SUM)(\\s+(\\d+))+
         \____/ \___/|    \____/| 
           1      2  |       4  |
                      \________/ 
                           3  
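
To make the numbering above concrete, here is a minimal sketch that dumps every group after a successful matches() call. The class name RepeatedGroupDemo and the helper groups() are made up for illustration; only the Pattern and Matcher calls come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedGroupDemo {
    // Hypothetical helper: returns the text of every group (index 0 is the whole match),
    // or an empty array if the input does not match the pattern as a whole.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            result[i] = m.group(i); // null when group i took no part in the match
        }
        return result;
    }

    public static void main(String[] args) {
        String[] g = groups("(PROD)|(SUM)(\\s+(\\d+))+", "SUM   12  32 42");
        for (int i = 0; i < g.length; i++) {
            System.out.println(i + ": " + g[i]);
        }
    }
}
```

For the input from the question, groups 3 and 4 hold only what the last iteration of the repeated group captured (" 42" and "42"); the earlier iterations are overwritten.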
aioobe
4

Group X from this part of your regex:

(\\s+(\\d+))+
|          |
+----------+--> X

will first match " 12", then " 32" and finally " 42" (each with its leading whitespace). Each time, X's value is overwritten and the previous one is lost. If you want all values, you'll need to loop with Matcher.find() instead:

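import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "SUM   12  32 42 PROD 1 2";
Pattern number = Pattern.compile("\\d+"); // compiled once, reused for every command
Matcher m = Pattern.compile("(PROD|SUM)((\\s+\\d+)+)").matcher(s);
while (m.find()) {
    System.out.println("Matched : " + m.group(1));
    // Group 2 holds the whole run of operands, e.g. "   12  32 42"
    Matcher values = number.matcher(m.group(2));
    while (values.find()) {
        System.out.println("        : " + values.group());
    }
}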

which will print:

Matched : SUM
        : 12
        : 32
        : 42
Matched : PROD
        : 1
        : 2

You see null printed because group 1 is (PROD), which did not take part in the match.
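
If you need the numbers themselves rather than printed lines, the same two-step idea can be wrapped in a small helper. This is only a sketch: OperandExtractor and operands() are names I made up, it looks at the first command only, and Integer.parseInt will throw on digit runs too large for an int.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OperandExtractor {
    private static final Pattern COMMAND = Pattern.compile("(PROD|SUM)((\\s+\\d+)+)");
    private static final Pattern NUMBER = Pattern.compile("\\d+");

    // Hypothetical helper: returns the operands of the first command in the input,
    // or an empty list if no command is found.
    static List<Integer> operands(String input) {
        List<Integer> values = new ArrayList<>();
        Matcher command = COMMAND.matcher(input);
        if (command.find()) {
            // Group 2 is the whole operand run; scan it for individual numbers.
            Matcher number = NUMBER.matcher(command.group(2));
            while (number.find()) {
                values.add(Integer.parseInt(number.group()));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(operands("SUM   12  32 42"));
    }
}
```

The outer pattern only locates the command and its operand run; the inner find() loop is what actually recovers every value, since a single capturing group cannot.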

Bart Kiers
-1
I wonder what's a null

Capturing groups are indexed from left to right, starting at one. Group zero denotes the entire pattern, so the expression m.group(0) is equivalent to m.group().

http://download.oracle.com/javase/1.5.0/docs/api/java/util/regex/Matcher.html#group%28int%29

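The same documentation says that group(int) returns null when the match succeeded but the given group did not match any part of the input. Here the string matches the pattern as a whole (matches() returns true), but group 1, (PROD), took no part in that match, so m.group(1) is null.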
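
A quick way to check both points from the documentation is the sketch below. GroupZeroDemo and participated() are illustrative names, not part of any API; m.start(i) returning -1 for a group that matched nothing is the documented Matcher behaviour.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupZeroDemo {
    private static final Pattern PATTERN = Pattern.compile("(PROD)|(SUM)(\\s+(\\d+))+");

    // Hypothetical helper: true if the whole input matches and group i took part in the match.
    static boolean participated(String input, int i) {
        Matcher m = PATTERN.matcher(input);
        return m.matches() && m.start(i) != -1;
    }

    public static void main(String[] args) {
        Matcher m = PATTERN.matcher("SUM   12  32 42");
        if (m.matches()) {
            System.out.println(m.group(0).equals(m.group())); // group 0 is the whole match
            System.out.println(m.group(1) == null);           // (PROD) did not participate
            System.out.println(m.start(1));                   // -1 for a non-participating group
        }
    }
}
```

Checking m.group(i) == null or m.start(i) == -1 before using a group avoids surprises when only one branch of an alternation matched.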

Zohaib